methods. A bacterial host cell transformed with DNA encoding R.CviJI is also disclosed, as well as methods for expressing R.CviJI in
the bacterial host system and subsequent materials and methods for purifying the enzyme.

DINUCLEOTIDE RESTRICTION ENDONUCLEASE PREPARATIONS AND METHODS OF

FIELD OF THE INVENTION
The present invention relates generally to isolated purified 
polynucleotides which encode restriction enzymes and to methods of expressing 
the restriction enzymes from such polynucleotides. More particularly, this
invention relates to isolated purified polynucleotides which encode CviJI and
related methods for the production of this enzyme. 

Other aspects of the invention relate to methods for partially or 
completely digesting DNA at a dinucleotide sequence. More particularly, this 
aspect of the invention relates to methods of generating quasi-random fragments 
of DNA, and methods of cloning, labeling, and sequencing DNA, as well as 
epitope mapping of proteins. The invention also relates to methods for generating 
sequence-specific oligonucleotides from DNA, without prior knowledge of the 
nucleic acid sequence of such DNA, and to methods for cloning and labeling 
DNA after restriction digestion by a two base recognition endonuclease reagent. 
This invention also relates to methods for cloning, labeling, and detecting nucleic 
acids using two base restriction endonuclease reagents, such as CviJ I, BsuR I, 
Aci I or CGase I. Further, the invention relates to labeling DNA by taking
advantage of certain properties of the holo-enzyme of thermostable DNA 
polymerases. 



BACKGROUND OF THE INVENTION
Restriction endonucleases are a group of enzymes originally found 
to be expressed in a wide variety of prokaryotic organisms. More recently they 
have also been found to be encoded in viral genomes. These enzymes catalyze 
the selective cleavage of DNA at generally short sequences, often unique to the 
individual enzyme. This ability to cleave makes restriction endonucleases 
indispensable tools in recombinant DNA technology. The increased commercial
availability of the isolated enzymes has contributed in large part to the enormous
expansion in the field of recombinant DNA technology over the last few years. 

These enzymes have been classified into three groups. Because of 
properties of the type I and type III enzymes, they have not been widely used in
molecular biology applications, and will not be discussed further. Type II 
enzymes are part of a binary system known as a restriction modification system 
consisting of a restriction endonuclease that cleaves a specific sequence of 
nucleotides and a separate DNA modifying enzyme that modifies the same 
recognition sequence and thereby prevents cleavage by the cognate endonuclease. 
A total of about 2103 restriction enzymes are known, encompassing 179 different
type II specificities (Roberts, et al., Nucl. Acids Res. 20:2167-2180 (1992)).
Although there are more than 1200 type II restriction enzymes, many of them are
members of groups which recognize the same sequence. Restriction enzymes that 
recognize the same sequence are said to be isoschizomers. 

The vast majority of type II restriction enzymes recognize specific
double-stranded sequences which are four, five, or six nucleotides in length and 
which display twofold (palindromic) symmetry. A few enzymes recognize longer 
sequences or degenerate sequences. 

The location of cleavage sites within a palindrome differs from 
enzyme to enzyme. Some enzymes cleave both strands exactly at the axis of 
symmetry generating fragments of DNA that carry blunt ends, while others cleave 
each strand at similar sequences on opposite sides of the axis of symmetry, 
creating fragments of DNA that carry protruding, single-stranded termini. 

Restriction endonucleases with shorter recognition sequences cut 
DNA more frequently than those with longer recognition sequences. For 
example, assuming a 50% G-C content, a restriction endonuclease with a 4-base
recognition sequence will cleave, on average, every 4^4 (256) bases compared to
every 4^6 (4096) bases for a restriction endonuclease with a 6-base recognition
sequence. Under certain conditions some restriction endonucleases are capable
of cleaving sequences which are similar but not identical to their defined
recognition sequence. This altered specificity has been termed "star" (*) activity 
and is observed only under certain non-standard reaction conditions. The manner 
in which an enzyme's specificity is altered depends on the particular enzyme and 
on the conditions employed to induce the star activity. Conditions that contribute 
to star activity include high glycerol concentration, high ratio of enzyme to DNA, 
low ionic strength, high pH, the presence of organic solvents, and the substitution 
of Mg++ with other divalent cations. The most common types of star activity
involve cutting at a recognition sequence having a single base substitution, cutting 
at sites having truncation of the outer bases of the recognition sequence, and 
single-strand nicking. The following restriction endonucleases show star activity: 
Ase I, BamH I, BssH II, BsuR I, CviJ I, EcoR I, EcoR V, Hind III, Hinf I, Kpn
I, Pst I, Pvu II, Sal I, Sca I, Taq I, and Xmn I. Star activity is generally viewed
as undesirable, and of little intrinsic value. 
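
The cutting-frequency estimate given above can be reproduced with a short calculation. The following Python sketch is illustrative only and is not part of the original disclosure. The function name expected_site_spacing and the underlying model (independent bases, with G and C each present at half the stated G-C content) are assumptions introduced here.

```python
def expected_site_spacing(site: str, gc_content: float = 0.5) -> float:
    """Average number of bases between occurrences of a fully specified site.

    Assumes each base is drawn independently, with G and C each at
    gc_content / 2 and A and T each at (1 - gc_content) / 2.
    """
    base_probability = {
        "G": gc_content / 2,
        "C": gc_content / 2,
        "A": (1 - gc_content) / 2,
        "T": (1 - gc_content) / 2,
    }
    site_probability = 1.0
    for base in site.upper():
        site_probability *= base_probability[base]
    return 1.0 / site_probability


if __name__ == "__main__":
    # A 4-base site (GGCC) and a 6-base site (GAATTC, the EcoR I site).
    print(expected_site_spacing("GGCC"))    # 4**4 = 256.0
    print(expected_site_spacing("GAATTC"))  # 4**6 = 4096.0
```

At 50% G-C content every base has probability 1/4, so the result reduces to 4 raised to the length of the site. Other G-C contents can be explored by changing the gc_content argument.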

Of the 179 unique type II restriction endonucleases, 31 have a 4-base
recognition sequence, 11 have a 5-base recognition sequence, 127 have a 6-base
recognition sequence, and 10 have recognition sequences of greater
than 6 bases. In two cases, a restriction endonuclease has a recognition sequence
of less than 4 bases.

The restriction enzyme CviJ I has a three base recognition sequence 
or a two-base recognition sequence, depending on the reaction conditions. Under 
normal reaction conditions CviJ I recognizes the sequence PuGCPy (wherein 
Pu=purine and Py=pyrimidine) and cleaves between the G and C to leave blunt 
ends (Xia et al., 1987. Nucleic Acids Res. 15:6075-6090). Under "relaxed" or 
"star" conditions (in the presence of 1 mM ATP and 20 mM DTT) the specificity 
of CviJ I may be altered to cleave DNA more frequently. This activity is referred
to as CviJ I*, for star or altered specificity. However, CviJ I* activity is not
observed under conditions which favor star activity of other restriction
endonucleases.

The restriction enzyme BsuR I normally recognizes the sequence
GGCC and cleaves between the G and C to leave blunt ends (Heininger, et al.,
Gene 1:291-303 (1977)). Under relaxed conditions (high pH, low ionic strength,
and high glycerol concentration) the specificity of BsuR I may be altered to cleave
DNA more frequently. An isoschizomer of this enzyme, Hae III, does not display
this star activity.
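
The normal CviJ I recognition rule described above (PuGCPy, cut between the G and the C to leave blunt ends) can be expressed as a simple pattern search. The following Python sketch is an illustration only. The names NORMAL_SITE, cut_positions and digest are hypothetical, and the model treats the substrate as a linear, unmethylated sequence containing only A, C, G and T.

```python
import re

# PuGCPy: purine (A or G), then G, then C, then pyrimidine (C or T).
# A zero-width lookahead is used so that matching never consumes bases.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")


def cut_positions(sequence: str) -> list[int]:
    """Return 0-based indices at which the strand is cut (between G and C)."""
    return [match.start() + 2 for match in NORMAL_SITE.finditer(sequence.upper())]


def digest(sequence: str) -> list[str]:
    """Completely digest a linear sequence and return the fragments in order."""
    bounds = [0, *cut_positions(sequence), len(sequence)]
    return [sequence[start:end] for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Contains one AGCT site and one GGCC site (GGCC is also the BsuR I site).
    print(digest("TTAGCTAAGGCCTT"))  # ['TTAG', 'CTAAGG', 'CCTT']
```

Because the reverse complement of PuGCPy is again PuGCPy, scanning one strand locates the sites on both strands, and a cut at the center of the site is blunt.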

In bacteria, the restriction endonuclease provides a mechanism of 
defense against foreign DNA molecules (e.g., bacteriophage DNA) by virtue of 
its ability to distinguish and cleave only exogenous DNA, leaving endogenous 
bacterial DNA unaffected. Viral endonucleases possess the same discerning
capabilities, but rather than providing a means for defense, this activity has 
presumably evolved to cripple the host's ability to replicate its own DNA and 
allows the virus to assume control of the host's replication machinery. 

Bacteria and viruses which express restriction endonucleases 
necessarily possess the inherent ability to protect their own genome from cleavage
by their endogenous endonuclease. The primary mechanism by which this is
accomplished is by modifying the organism's own DNA by, for example,
methylating a base in the recognition sequence, which prevents binding and
cleavage by the endonuclease. Therefore, to ensure viability, the genome of an
organism which expresses a restriction endonuclease is almost always heavily
modified, usually by methylation of cytosine or adenosine bases. The methylase
enzyme which modifies the genome (itself a useful tool in molecular biology) acts 
in tandem with the endonuclease, either as part of an enzyme complex 
(restriction/modification complex) or as two distinct entities. Therefore, 
recognizing that an organism expresses an enzyme with endonuclease activity
strongly suggests the expression of an associated modifying methylase enzyme
(and vice versa) and this association has led to isolation and cloning of a number
of commercially available restriction/modification enzymes for use in the
laboratory as discussed below.

One of the limitations in the use of restriction endonucleases exists
when cleavage of a given sequence is required and no known endonuclease exists
which is specific for that particular sequence. Therefore, the continued
identification and isolation of unique restriction endonucleases and altered reaction
conditions will allow for even more sophisticated manipulation of DNA in vitro.

A number of publications and patents describe the cloning of DNAs 
encoding restriction endonucleases. Included among these publications is Kiss,
A., et al., Nucleic Acids Research 13:6403-6421 (1985), which describes the
cloned nucleotide sequence of the BsuRI restriction-modification system isolated
from Bacillus subtilis. This system is specific for the sequence 5'-GGCC-3' and
is defined by two gene products which are transcribed by different promoters. 
The methylase component of the system shows homology to the methylase from 
the BspRI and SPR restriction-modification systems. 

Nwanko, D.O. and Wilson, G.G. Gene 64:1-8 (1988), describe the 
cloning and expression of the MspI restriction and modification genes isolated
from Moraxella sp. This system recognizes the sequence 5'-CCGG-3' and both
enzymes are functional in E. coli. Evidence indicates that these genes are 
transcribed in opposite directions, thus are probably under the control of different 
promoters. 

Ashok, K.D., et al., Nucleic Acids Research 20:1579-1585 (1992),
describe the purification and characterization of cloned MspI methyltransferase,
over-expressed in E. coli. At low concentrations the enzyme exists as a
monomer, but at higher concentrations it exists mainly as a dimer. Polyclonal
antibodies to the enzyme cross-react with methyltransferase genes of other
modification systems.

Brooks, J.E., et al., Nucleic Acids Research 19:841-850 (1991),
characterize the cloned BamHI restriction modification system from Bacillus
subtilis. The two genes are divergently oriented and separated by an open reading
frame which may serve as a transcriptional regulator in the native bacteria.

Slatko, B.E., et al., Nucleic Acids Research 15:9781-9796 (1987),
describe the cloning, sequencing and expression of the TaqI
restriction-modification system. These genes have the same transcriptional orientation, with
the methylase gene 5' to the endonuclease gene. E. coli clones which carry only
the endonuclease gene are viable even in the absence of the methylase gene. This
is an unusual case possibly explained by the 65°C optimal temperature for TaqI
restriction and the 37°C optimal temperature for E. coli growth.

Howard, K.A., et al., Nucleic Acids Research 14:7939-7951
(1986), describe the cloning of the DdeI restriction modification system from
Desulfovibrio desulfuricans by a two step method wherein the methylase gene is
first cloned and transformed into E. coli, followed by the cloning of the
endonuclease gene and transformation of this second gene into the
methylase-expressing bacteria. In order to maintain cell viability, high levels of methylase
expression are required before the endonuclease gene can be introduced into the
bacteria.

Ito, H., et al., Nucleic Acids Research 18:3903-3911 (1990),
describe the cloning, nucleotide sequence and expression of the HincII
restriction-modification system. The DNA was isolated from H. influenzae Rc, with the two
genes positioned in the same transcriptional orientation. 

Shields, S.L., et al., Virology 176:16-24 (1990), describe the
cloning and sequencing of the cytosine methyltransferase gene M.CviJI from the
Chlorella virus IL-3A. The methylase recognizes the sequence (G/A)GC(T/C/G)
and shows amino acid sequence homology with 5-methylcytosine methylases
isolated from bacteria. DNA encoding the methylase was obtained from the viral
genome which was propagated in the green alga host Chlorella.

Xia, Y., et al., Nucleic Acids Research 15:6075-6090 (1987),
discovered that IL-3A virus infection of Chlorella-like green alga induces the
expression of the DNA restriction endonuclease CviJI which has novel sequence
specificity. This endonuclease recognizes the sequence PuGCPy (wherein Pu =
purine and Py = pyrimidine) but does not cut the sequence PuGmCPy, where mC
is 5-methylcytosine.

U.S. Patent 5,137,823, issued August 11, 1992, to Brooks, J.E., 
describes a two step method for cloning the BamHI restriction modification
system wherein the methylase is cloned first and then introduced into a bacterial
host. The endonuclease is then cloned and introduced into the
methylase-expressing bacteria. This two step procedure protects the host DNA
from cleavage by the subsequently introduced endonuclease.

U.S. Patent 5,200,333, ('333) issued April 6, 1993, to Wilson, 
G.G., describes a method for cloning restriction and modification genes. 
Specifically, this reference describes the cloning of the TaqI and HaeII systems
from Thermus aquaticus and Haemophilus aegyptius, respectively. In this
method, bacterial DNA was initially purified and digested, and the fragments 
were then cloned into a vector to produce a bacterial DNA library. The library 
was then transformed into E. coli and the cells were plated. Colonies were then 
scraped from the plate to form a primary cell library. Plasmid DNA from this 
cell library was purified and digested with the endonuclease of the two gene 
system. Bacteria which expressed the methylase gene had modified plasmid DNA 
which was protected from endonuclease activity, while plasmids from bacteria 
which lacked the intact methylase gene were digested. The resulting, undigested 
plasmid DNA was then transformed into another bacterial strain and the bacteria 
were plated. Surviving colonies were again harvested to give a secondary cell 
library and the entire procedure repeated. Plasmids which code for the complete 
restriction-modification system presumably survived each round of purification 
and were enriched. Bacteria which survived several rounds of enrichment were
subsequently assayed for both methylase and endonuclease activity. 

U.S. Patent 5,196,331, ('331) issued March 23, 1993, to Wilson, 
G.G. and Nwanko, D., describes a method for cloning the MspI restriction and
modification genes. This patent describes a method identical to that of U.S.
Patent 5,200,333 ('333). '331 is a continuation-in-part of, and '333 is a
continuation of, U.S.S.N. 707,079 (now abandoned).

As mentioned above, Chlorella virus IL-3A encodes a unique
restriction endonuclease called CviJI (Xia et al., Nucleic Acids Res. 15:6075-6090
(1987)). IL-3A is a large, polyhedral, plaque-forming phycodnavirus (Francki,
R.I.B., et al., Arch. Virol. Suppl. 2, Springer-Verlag, Vienna (1991)) that replicates
in unicellular, eukaryotic green algae, Chlorella strain NC64A (Schuster, A.M.,
et al., Virology 150:170-177 (1986)). The double-stranded DNA genome of IL-3A
is approximately 330 kbp (Rohozinski et al., Virology 168:363-369 (1989)) and
contains 9.7% methylated cytidine (Van Etten, J.L., et al., Nucleic Acids Res.
13:3471-3478 (1985)). The cognate methyltransferase of CviJI, M.CviJI,
methylates (A/G)GC(T/C/G) sequences and has been cloned and sequenced
(Shields, S.L., et al., Virology 176:16-24 (1990)).

The use of a two/three base recognition endonuclease, such as 
CviJI, to improve numerous conventional molecular biology applications as well
as to permit novel applications has been described in co-pending U.S. Patent
Application Ser. No. 08/036,481, filed on March 24, 1993. The application
discloses methods for generating sequence-specific oligonucleotides from DNA 
without prior knowledge of the nucleic acid sequence of such DNA, and to 
methods for cloning and labeling DNA after restriction digestion by a two base 
recognition endonuclease. The application also teaches methods for generating 
quasi-random fragments of DNA, methods for cloning, labeling, and sequencing 
DNA, as well as epitope mapping of proteins. The ability to generate numerous 
oligonucleotides with perfect sequence specificity or quasi-random distributions 
of DNA fragments such as is possible with CviJI has important implications for
a number of conventional and novel molecular biology procedures. 

Infection of Chlorella species NC64A with the IL-3A virus
produces sufficient CviJI restriction endonuclease (CviJI) for research purposes.
However, production of commercially useful amounts of CviJI is limited with this
system due to the slow growth of Chlorella algae, the large number of
contaminating nucleases associated with the virus, and the small yield of enzyme
obtained after purification. In addition, biochemical and biophysical
characterization of the enzyme, such as molecular weight determination, is
difficult from the native source. Because of these limitations it would be useful
to clone the gene for CviJI in order to provide an adequate large scale source of
enzyme for use as a molecular biological reagent.

SUMMARY OF THE INVENTION

In one of its aspects, the present invention provides purified and 
isolated polynucleotides (e.g., DNA sequences and RNA transcripts thereof) 
encoding a unique restriction endonuclease, CviJI, as well as polypeptides and
variants thereof which display activities characteristic of CviJI. Activities of CviJI
include the recognition of specific DNA sequences, binding to these sequences 
and cleaving the bound DNA into fragments. Preferred DNA sequences of the 
invention include viral genomic sequences as well as wholly or partially 
chemically synthesized DNA sequences. Replicas (i.e., copies of the isolated 
DNA sequences made in vivo or in vitro) of DNA sequences of the invention are 
also contemplated. A preferred DNA sequence is set forth in SEQ ID NO: 2 
herein and is contained as an insert in the plasmid pCJH1.4. In another of its 
aspects, the invention provides purified isolated DNA encoding a CviJI
polypeptide by means of degenerate codons. 

Also provided are autonomously replicating recombinant 
constructions such as plasmid DNA vectors incorporating CviJI sequences and
especially vectors wherein DNA encoding CviJI or a CviJI variant is operatively
linked to an endogenous or exogenous expression control DNA sequence. 

According to another aspect of the invention, host cells such as 
prokaryotic and eukaryotic cells, are stably transformed with DNA sequences of 
the invention in a manner allowing the desired polypeptides to be expressed
therein. Host cells expressing CviJI and CviJI variant products are useful in
methods for the large scale production of CviJI and CviJI variants wherein the
cells are grown in a suitable culture medium and the desired polypeptide products
are isolated from the host cells or from the medium in which the cells are grown.
A preferred host cell is E. coli. Still another aspect of the invention is a
recombinant CviJI polypeptide.

The present invention is also directed to a method for the digestion 
of DNA with a restriction endonuclease reagent under conditions wherein said 
DNA is cleaved at a dinucleotide sequence selected from the group consisting of
PyGCPy, PuGCPy, and PuGCPu, wherein Pu = purine and Py = pyrimidine.
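
The three sequence classes named above can be told apart by the bases flanking each GC dinucleotide. The following Python sketch is offered as an illustration under stated assumptions and is not taken from the disclosure. The names classify_gc_sites and relaxed_cut_positions are hypothetical, the input is assumed to contain only A, C, G and T, and a GC at the very end of a linear molecule (lacking a flanking base) is simply skipped.

```python
PURINES = set("AG")

# Sequence classes recited in the text as cleaved under the relaxed conditions.
RELAXED_CLASSES = {"PyGCPy", "PuGCPy", "PuGCPu"}


def classify_gc_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (cut_index, class) for each GC dinucleotide that has both flanks.

    cut_index is the 0-based position between the G and the C.
    """
    seq = sequence.upper()
    sites = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            left = "Pu" if seq[i - 1] in PURINES else "Py"
            right = "Pu" if seq[i + 2] in PURINES else "Py"
            sites.append((i + 1, f"{left}GC{right}"))
    return sites


def relaxed_cut_positions(sequence: str) -> list[int]:
    """Cut positions for GC dinucleotides falling in one of the recited classes."""
    return [pos for pos, cls in classify_gc_sites(sequence) if cls in RELAXED_CLASSES]


if __name__ == "__main__":
    example = "AGCTTGCTAGCATGCA"  # one GC in each of the four possible contexts
    print(classify_gc_sites(example))
    print(relaxed_cut_positions(example))  # [2, 6, 10]
```

In the example, the fourth GC lies in a PyGCPu context, which is the one class not recited in the paragraph above, so it is left uncut by this model.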

The present invention is also directed to a method for restriction 
endonuclease digestion of DNA comprising the step of digesting DNA with a 
restriction endonuclease reagent under conditions wherein said DNA is digested 
at 11 of 16 possible dinucleotide sequences and wherein said dinucleotide 
sequences are selected from the group consisting of PuCGPu, PuCGPy, and 
PyCGPu, and wherein Pu = purine and Py = pyrimidine.

The present invention is directed to shotgun cloning of DNA,
epitope mapping, and labeling of DNA using the digestion methods of the present
invention. The present invention provides methods for quasi-random fragmenting
of DNA using the digestion methods of the present invention under conditions 
wherein the DNA is only partially cleaved and the site preference of the 
restriction endonuclease reagent is greatly reduced. By quasi-random is meant an 
overlapping population of DNA fragments produced by digesting DNA using the 
methods of the present invention without apparent site-preference and which
appears as a smear upon electrophoresis in a 1-2 wt. % agarose gel. The present 
invention is also directed to the shotgun cloning and sequencing of quasi-random 
fragments of DNA produced by the methods of the present invention. Quasi- 
random fragments in the shotgun cloning method of the present invention are 
produced by partial digestion of DNA with a restriction endonuclease reagent
according to the methods of the present invention. More particularly,
quasi-random fragments of DNA useful in the cloning method of the present invention
are produced by the partial digestion of the DNA to be cloned with CviJ I, BsuR
I or with a restriction endonuclease reagent termed CGase I comprising Taq I and
Hpa II. Quasi-random fragments having a length of between about 100 and about
10,000 nucleotides are preferred. More preferred are quasi-random fragments of 
about 500 to about 10,000 nucleotides in length. The present invention is also 
directed to the generation of quasi-random fragmentation of DNA using the 
method of the present invention for the purposes of epitope mapping and gene 
cloning. These quasi-random fragments are expressed either in vitro or in vivo 
and the smallest fragment containing the desired function is identified by 
screening assays well known in the art. 
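
The idea of a partial digest producing an overlapping, quasi-random population of fragments can be illustrated with a small simulation. The Python sketch below is a simplified model and not the method of the disclosure. It assumes that every site on a linear molecule is cut independently with the same probability (that is, no site preference), and the names partial_digest and size_select, as well as the example numbers, are hypothetical.

```python
import random


def partial_digest(length: int, sites: list[int], cut_probability: float,
                   rng: random.Random) -> list[int]:
    """Cut each site independently with cut_probability; return fragment lengths.

    sites are distinct 0-based cut positions with 0 < position < length.
    """
    cuts = sorted(site for site in sites if rng.random() < cut_probability)
    bounds = [0, *cuts, length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def size_select(fragments: list[int], minimum: int = 500,
                maximum: int = 10_000) -> list[int]:
    """Keep fragments in the more preferred size range given in the text."""
    return [size for size in fragments if minimum <= size <= maximum]


if __name__ == "__main__":
    rng = random.Random(0)                # fixed seed so the run is repeatable
    length = 48_000                       # hypothetical template length
    sites = list(range(16, length, 16))   # idealized: one site every 16 bp
    fragments = partial_digest(length, sites, cut_probability=0.02, rng=rng)
    kept = size_select(fragments)
    print(len(fragments), "fragments,", len(kept), "within 500-10,000 bp")
```

Repeating the digest on many copies of the same template, each with independent random cuts, yields fragments with different end points, which corresponds to the overlapping population that appears as a smear on a gel.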

The present invention is also directed to the production of 
anonymous primers from any DNA without prior knowledge of the nucleotide 
sequence. The present invention provides methods for anonymous primer cloning 
and sequencing after complete digestion of DNA utilizing CviJ I, BsuR I or 
CGase I using the methods of the present invention. 

Additionally, the present invention is directed to methods of 
labeling and detecting DNA comprising the complete digestion of DNA using the 
methods of the present invention, followed by a heat denaturation step, to yield 
sequence specific oligonucleotides. In particular, an aspect of the present 
invention involves labeling DNA with sequence specific oligonucleotides of about 
20 to about 200 bases in length (with an average size of between 20-60 bases) 
generated by CviJ I, BsuR I or CGase I digestion of the template DNA. 

More particularly, the invention is directed to restriction generated 
oligonucleotide labeling (RGOL) of DNA which comprises the digestion of an 
aliquot of template DNA with CviJ I followed by a simple heat denaturation step, 
thereby generating numerous sequence specific oligonucleotides, which can then 
be utilized for labeling nucleic acids by a number of methods, including primer
extension type reactions with a DNA polymerase and various labels, isotopic
or non-isotopic (RGOL-PEL); 5' end labeling with polynucleotide kinase; and 3' end
labeling using terminal transferase and various labels, isotopic or non-isotopic.
Labeling at the 3' end, also referred to as tailing, adds numerous labels per
oligonucleotide (1-200), depending on the labeling conditions. The addition of
10-500 oligonucleotides generated per template results in a significant signal
amplification not obtainable by conventional methods.

The invention is also directed to thermal cycle labeling (TCL) 
which comprises the simultaneous labeling and amplification of probes utilizing 
CviJ I or CGase I restriction generated oligonucleotides as the starting material. 
In this method, natural DNA of unknown sequence is digested with CviJ I to 
generate numerous double-stranded fragments which are then heat denatured to 
yield oligonucleotides. These oligonucleotides are combined with the intact 
template and subjected to repeated cycles of denaturation, annealing, and 
extension in the presence of a thermostable DNA polymerase or functional
fragment thereof which maintains polymerase activity, deoxynucleotide
triphosphates and the appropriate buffer. Alpha-32P-dATP (or any of the other
three deoxynucleotide triphosphates), biotin-dUTP, fluorescein-dUTP, or
digoxigenin-dUTP is incorporated during the extension step for subsequent
detection purposes. Thermal cycle labeling efficiently labels DNA while
simultaneously amplifying large amounts of the labeled probe. In addition, TCL 
probes exhibit a 10 fold improvement in detection sensitivity compared to 
conventional probes. 

The present invention is also directed to TCL in which the 
thermostable DNA polymerase supplies endogenous primers for enzymatic
extension. This method is referred to as Universal Thermal Cycle Labeling 
(UTCL). In this method natural DNA of unknown sequence is combined intact 
with the holo-enzyme of a thermostable DNA polymerase, deoxyribonucleotide 
triphosphates, and the appropriate buffer. The holo-enzyme and its associated
endogenous primers are then combined with intact template and subjected to
repeated cycles of denaturation, annealing and extension. Alpha-32P-dATP,
32P-dTTP, 32P-dGTP, 32P-dCTP, biotin-dUTP, fluorescein-dUTP, or
digoxigenin-dUTP is also included in the extension step for subsequent detection purposes.
Isotopic labels useful in the practice of the present invention include but are not
limited to 32P, 33P, 14C and 3H. Non-isotopic labels useful in the present
invention include but are not limited to fluorescein, biotin, dinitrophenol and
digoxigenin.

The present invention is also directed to an improved method for 
purifying CviJ I from the algae Chlorella infected with the virus IL-3A. 

In addition the present invention is directed to restriction 
endonuclease reagents which, under conditions which relax the sequence 
specificity of one or more restriction endonucleases, cleave DNA at the 
dinucleotide sequences AT or TA. 

The present invention is also directed to a restriction endonuclease 
reagent comprising, in combination, Taq I and Hpa II, which is capable of
digesting DNA at 11 of 16 possible dinucleotide sequences, said sequences 
selected from the group consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu, 
and wherein Pu = purine and Py = pyrimidine. 

The following examples are intended to be illustrative of the several 
aspects of the present invention and are not intended in any way to limit the scope 
of any aspect of the present invention. 

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a map of the plasmid p710 which contains DNA
sequences encoding the IL-3A viral methyltransferase M.CviJI;

Figure 2 is the nucleotide sequence of 5497 bp of cloned IL-3A
viral DNA;

Figure 3 is a restriction map of the cloned IL-3A viral DNA,
including the identified open reading frames;

Figure 4 is the DNA sequence of the CviJI gene with its flanking
regions. The predicted amino acid sequence is provided below the nucleotide
sequences;

Figure 5A depicts the theoretical frequency and distribution of 
CviJI restriction generated oligomers of individual lengths; Figure 5B shows the
actual frequency and distribution of CviJI restriction generated oligomers of
various lengths; 

Figure 6 is a flow chart depicting anonymous primer cloning;

Figure 7 is a photographic reproduction of a gel depicting CviJI
restriction digests of pUC19;

Figure 8 is a photographic reproduction of a gel depicting
comparisons of sonicated versus CviJI* partially digested DNAs;

Figure 9A is a photographic reproduction of an agarose gel 
electrophoresis analysis of size-fractionated DNA by microcolumn 
chromatography compared to fractionation by agarose gel electroelution; 

Figure 9B-E illustrates additional trials of the same procedures 
used in Figure 9A; 

Figure 10A illustrates the size distribution of DNA fragments 
produced by partial digestion of DNA by CviJI and fractionated by microcolumn
chromatography;

Figure 10B-C illustrates the size distribution of DNA fragments
produced by partial digestion of DNA by CviJI and fractionated by agarose gel
electrophoresis;

Figure 11 is a schematic depiction of the distribution of CviJI sites
in pUC19; and 

Figure 12 is a graph of the rate of sequence accumulation by 
CviJI shotgun cloning and sequencing.

DETAILED DESCRIPTION
The gene for the restriction endonuclease R.CviJI was cloned into
E. coli so as to provide an adequate source of R.CviJI for use as a molecular
biological reagent. Biologically active CviJI has been purified from E. coli to
apparent homogeneity. The molecular weight of E. coli derived R.CviJI is 32.5
kD by SDS gel electrophoresis. N-terminal amino acid sequence analysis of this
protein and comparison to the nucleotide sequence of the gene revealed that the
translation of this enzyme is probably initiated with a GTG start codon, instead
of the usual ATG initiation codon. The structural gene is 834 nucleotides in
length coding for a protein of 278 amino acids (31.6 kD). A second peak of
R.CviJI activity which elutes separately from the 32.5 kD form can be seen in the
initial stages of enzyme purification. Trace amounts of a larger molecular weight
form have not been observed to date. However, the R.CviJI gene does possess
an in-frame upstream ATG codon which if translated would yield a predicted 41.4
kD protein. The structural gene for this potentially larger product is 1074
nucleotides in length coding for a putative protein of 358 amino acids.
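
The gene and protein lengths quoted above are related by the three-nucleotide codon, which can be checked directly. The Python sketch below is illustrative only. The helper names are hypothetical, and the figure of 110 daltons per residue is a common rule of thumb that is assumed here and does not come from the text, so the mass estimates are rough and are not expected to match the predicted 31.6 kD and 41.4 kD values exactly.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb average; an assumption, not from the text


def codons(nucleotides: int) -> int:
    """Number of codons in a coding region of the given length."""
    if nucleotides % 3:
        raise ValueError("length is not a whole number of codons")
    return nucleotides // 3


def rough_mass_kd(residues: int) -> float:
    """Very rough protein mass estimate in kilodaltons."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000


if __name__ == "__main__":
    for label, length in (("GTG-initiated gene", 834), ("upstream ATG gene", 1074)):
        residues = codons(length)
        print(label, length, "nt ->", residues, "aa, roughly",
              round(rough_mass_kd(residues), 1), "kD")
```

The two coding regions differ by 240 nucleotides, so translation from the upstream ATG would add 80 residues to the amino terminus of the 278 residue product.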

The present invention is also directed to a method for the 
fragmentation and cloning of DNA using the restriction endonuclease CviJ I under 
conditions which allow the enzyme to cleave DNA at the dinucleotide sequence 
GC. In addition, the present invention is also directed to the cloning of
quasi-random fragments of DNA digested using the fragmentation method of the present
invention. 

As an alternative to the methods for constructing random clone 
libraries described above, methods were devised for the construction of such 
libraries which require fewer steps and reagents, which require smaller amounts
of DNA, which have relatively high cloning efficiencies and which take less time
to complete. These methods relate to the recognition that a partial digest with a
two or three base recognition endonuclease cleaves DNA frequently enough to be
functionally random with respect to the rate at which sequence data may be
accumulated from a shotgun clone bank. The restriction enzyme CviJ I normally
recognizes the sequence PuGCPy and cleaves between the G and C to leave blunt
ends (Xia et al., Nucl. Acids Res. 15:6075-6090 (1987)). Under "relaxed"
conditions (in the presence of 1 mM ATP and 20 mM DTT) the specificity of
CviJ I can be altered to cleave DNA more frequently and perhaps as frequently 
as at every GC. This activity is referred to as CviJ I*. Because of the high 
frequency of the dinucleotide GC in all DNA (16 bp average fragment size for 
random DNA), quasi-random libraries may be constructed by partial digestion of 
DNA with CviJ J*. A DNA degradation method with low levels of sequence 
specificity produces a smear of the target DNA when analyzed by agarose gel 
electrophoresis. Digestion of the plasmid pUC19 under partial CviJ I* conditions 
does not result in a non-discrete smear; rather, a number of discrete bands are 
found superimposed upon a light background of smearing, suggesting that CviJ 
I has some site preference. Atypical reaction conditions according to the present
invention eliminate this apparent site preference of CviJ I* to produce an activity
(termed CviJ I**) which, in combination with a rapid gel filtration size exclusion step,
streamlines a number of aspects involved in shotgun cloning.
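
The 16 bp average fragment size quoted above for the dinucleotide GC follows from treating DNA as a random sequence with all four bases equally likely. The sketch below illustrates that calculation under exactly that assumption; the function names are hypothetical, and R and Y are the standard one-letter codes for purine (Pu) and pyrimidine (Py).

```python
# Expected spacing of a recognition site in idealized random DNA,
# assuming independent positions and equal base frequencies (an assumption;
# real genomes deviate from this).
ALLOWED_BASES = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "N": 4}


def site_probability(pattern: str) -> float:
    """Probability that a given position starts a match to the pattern."""
    p = 1.0
    for symbol in pattern.upper():
        p *= ALLOWED_BASES[symbol] / 4
    return p


def mean_fragment_bp(pattern: str) -> float:
    """Average distance between sites, i.e. mean complete-digest fragment size."""
    return 1.0 / site_probability(pattern)


if __name__ == "__main__":
    print("GC   (CviJ I* specificity):", mean_fragment_bp("GC"), "bp")
    print("RGCY (normal CviJ I, PuGCPy):", mean_fragment_bp("RGCY"), "bp")
```

Under these assumptions the relaxed GC specificity cuts about four times more often than the normal PuGCPy specificity (16 bp versus 64 bp average spacing).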

One aspect of the present invention involves the use of the 
two/three base recognition endonuclease CviJ I, in conjunction with a simple spin- 
column method to produce libraries equivalent in final form to those generated by 
the combination of sonication and agarose gel electroelution. However, the
method of the present invention requires fewer steps, a shorter time period, and
significantly less substrate (nanogram amounts) when compared to conventional 
procedures. Both small and large sequencing projects using the methods 
described herein are within the scope of the present invention. 

Current sequencing paradigms require the generation of a new 
template for each 350-500 nucleotides sequenced. On this basis, sequencing both 
strands of the human genome would require at least 12 million templates 500 
nucleotides long, assuming no overlap between templates. 
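
The 12 million template estimate can be reproduced with a short calculation. This is a sketch under a stated assumption: the haploid human genome is taken as roughly 3 x 10^9 base pairs, a figure that is not given in the text but is consistent with its arithmetic. The function name is hypothetical.

```python
# Minimum number of non-overlapping templates needed to cover both strands.
HUMAN_GENOME_BP = 3_000_000_000  # assumed genome size, not stated in the text


def templates_needed(genome_bp: int, read_length: int, strands: int = 2) -> int:
    """Ceiling of (bases to be read) / (bases per template), using integers."""
    total = genome_bp * strands
    return -(-total // read_length)


if __name__ == "__main__":
    for read_length in (350, 500):
        n = templates_needed(HUMAN_GENOME_BP, read_length)
        print(f"{read_length} nt per template: at least {n:,} templates")
```

With 500 nucleotide templates this gives 12,000,000, matching the lower bound stated above; shorter reads raise the requirement accordingly.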
A random approach, such as shotgun sequencing, would require 30
to 50 million templates, assuming the entire genome were randomly subcloned.
As many as 250,000 libraries may be needed to generate the requisite templates 
from a subcloned and ordered array of this genome, depending on the type of 
vector utilized, and the degree of overlap between such clones. The ability to 
generate shotgun libraries in a semi-automated, microtiter plate format would 
greatly simplify such large scale projects. 

The development of methods for cloning large DNA molecules in 
yeast artificial chromosomes (Burke et al., Science 236:806-812 (1987)), or in
bacteriophage P1-derived vectors (Sternberg, Proc. Natl. Acad. Sci. USA 87:103-
107 (1990)), simplifies the subdivision and analysis of very large genomes.
However, the large size of the resulting subclones (100 - 1000 kbp) presents 
additional challenges for subsequent sequencing efforts. A report of the 
sequencing of a 134 kbp genome by random shotgun cloning directly into a 
bacteriophage M13 vector indicates that numerous intermediate stages of 
subcloning, mapping, and overlapping such clones may be eliminated (Davison,
J. DNA Seq. and Mapping 1:389-394 (1992)). An order of magnitude reduction
in the amount of DNA required for shotgun cloning would substantially simplify 
efforts to directly sequence 100,000 bp sized molecules and beyond. 

The ability to generate an overlapping population of randomly 
fragmented DNA molecules is considered essential for minimizing the closure of 
nucleotide sequence gaps by the shotgun cloning method. The use of a very 
frequent-cutting restriction enzyme, such as CviJ I, is an approach which has not
been utilized. Reaction conditions according to the present invention result in the 
quasirandom restriction of pUC19 and lambda DNA, as judged by the degree of 
smearing observed. 

The randomness of this CviJ I** reaction was quantified by
sequence analysis of 76 such partially-fragmented pUC19 subclones. The analysis
showed that CviJ I** partial digestion (limiting enzyme and time) restricts DNA
at PyGCPy, PuGCPu, and PuGCPy (but not PyGCPu), and is thus a hybrid
reaction which combines the three base recognition specificity of CviJ I with the
"two" base recognition specificity of CviJ I*. Interestingly, most of the "relaxed"
cleavage observed under CviJ I** conditions occurred in those portions of the 
sequence which were deficient in "normal" restriction sites. CviJ I** treatment 
produces a relatively uniform size distribution of DNA fragments, permitting 
sequence information to be accumulated in a statistically random fashion. 
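
The site classes described above (PyGCPy, PuGCPu and PuGCPy cleaved, PyGCPu not cleaved) can be illustrated by scanning a sequence for GC dinucleotides and classifying each by its flanking bases. This is a minimal sketch, not the analysis actually performed on the 76 subclones: the function names are hypothetical, the sequence is treated as linear and assumed to contain only A, C, G and T, and GC dinucleotides lacking a flanking base on either side are ignored.

```python
# Classify GC dinucleotides by flanking purine (Pu) / pyrimidine (Py) context.
PURINES = set("AG")
CLEAVED_CLASSES = {"PyGCPy", "PuGCPu", "PuGCPy"}  # per the text; PyGCPu excluded


def classify_gc_sites(seq: str) -> list[tuple[int, str]]:
    """Return (index of G, class name) for every internal GC dinucleotide."""
    seq = seq.upper()
    sites = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            left = "Pu" if seq[i - 1] in PURINES else "Py"
            right = "Pu" if seq[i + 2] in PURINES else "Py"
            sites.append((i, f"{left}GC{right}"))
    return sites


def cut_positions(seq: str) -> list[int]:
    """Blunt cut positions (between G and C) for the cleaved site classes."""
    return [i + 1 for i, cls in classify_gc_sites(seq) if cls in CLEAVED_CLASSES]


if __name__ == "__main__":
    example = "AGCTTGCATGCC"  # arbitrary illustrative sequence
    print(classify_gc_sites(example))
    print(cut_positions(example))
```

In the example sequence, the middle GC falls in a PyGCPu context and is therefore skipped, while the other two sites are reported as cut positions.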

Shotgun cloning with CviJ I** digested DNA is efficient partly 
because the resulting fragments are blunt ended. Other methods currently used 
to randomly-fragment DNA, including sonication, DNAse I treatment, and low 
pressure shearing, leave ragged ends which must be converted to blunt ends for 
efficient vector ligation. Other than a heat denaturation step to inactivate the 
endonuclease, no additional treatments are required for cloning CviJ I** restricted 
DNA. In addition, the preligation step required to equalize representation of the 
ends of a DNA molecule prior to sonication or DNAse I treatment is not
necessary with CviJ I** fragmentation. CviJ I* cleaves its cognate recognition
site very close to the ends of a linear molecule, as judged by the very small
fragments resulting from complete digestion of pUC19 as depicted in Figure 2,
lane 1.



The overall efficiency of shotgun cloning depends not only on the 
fragmentation process, but also upon the size fractionation procedure used to 
remove small DNA fragments. The efficiency of cloning agarose gel fractionated 
DNA was found to be unexpectedly variable. Numerous experiments produced 
an erratic distribution of sized material and the resulting cloned inserts were 
uniformly small (70% < 500 bp in one trial, 100% < 500 bp in another). The
method of the present invention includes a simple and rapid micro-column 
fractionation method, which has resulted in three to thirteen times more 
transformants than agarose gel fractionation. More importantly, the size 
distribution of the cloned inserts from column-fractionated DNA was skewed 
toward larger fragments (88% > 500 bp). Micro-column fractionation also
eliminates the chemical extraction steps required for agarose fractionated DNA. 
After the target DNA has been column-fractionated, no further treatments are 
required for cloning. Combining CviJ I** partial restriction with micro-column 
fractionation permits the construction of useful libraries from as little as 200 ng
of substrate, an order of magnitude less starting material than recommended for 
sonication/end-repair and agarose gel fractionation procedures. 

The CviJ I reaction represents a unique alternative for controlling
the partial digestion of DNA, a technique which is fundamental to the construction
of genomic libraries (Maniatis et al., Cell 15:687-701 (1978)), and restriction site
mapping of recombinant clones (Smith, et al., Nucl. Acids Res. 3:2387-2398
(1976)). Partial DNA digests are notably variable and are strongly dependent on
the concentration and purity of the DNA, the amount of enzyme used, the 
incubation time, and the batch of enzyme. Partial digestions may also be variable 
with respect to the rate at which a particular recognition sequence is cleaved 
throughout the substrate. Optimal reaction conditions, such as those which render
such partial digests independent of one or more of these variables, allow more
precise control of the end product. Several controlling schemes may be
employed, including: the addition of a constant amount of carrier DNA (Kohara
et al., Cell 50:495-508 (1987)), the use of limiting amounts of Mg²⁺ (Albertson
et al., Nucl. Acids Res. 17:808 (1989)), ultraviolet irradiation (Whitaker, et al.,
Gene 41:129-134), and the combination of a restriction enzyme and a sequence
complementary DNA methylase (Hoheisel et al., Nucl. Acids Res. 17:9571-9582
(1989)). Utilizing three different batches of CviJ I, and three different DNA
templates from five separate preparations, a uniform CviJ I** partial digestion
pattern was obtained that was primarily time-dependent when a constant ratio of
0.3 units of enzyme per µg of DNA was used.

The rate at which a particular restriction site is cleaved at different 
locations in a substrate is variable for many endonucleases (Brooks, et al., 
Methods in Enzymol. 152:113-129 (1987)). Reaction conditions for CviJ I may
be optimized to substantially reduce the site preferences of this enzyme during
partial digestion (see Figure 2, lanes 3 and 4). Normally, "star" reaction
conditions result in cleavage at new sites. The use of star reaction conditions
according to the present invention (dimethyl sulfoxide [DMSO] and lowered ionic
strength) to affect the partial digestion activity of CviJ I** does not result in an
altered restriction site cleavage as assayed by sequencing the products of 76
digestion reactions. Instead, the relative rate of cleavage of individual sites
appears to be more uniform under these conditions. A 3-5 fold increase in the
rate of normal CviJ I restriction with the standard buffer and DMSO further
substantiates this approach. All of these results indicate that, under the
appropriate reaction conditions, CviJ I is useful for a number of other
applications, such as high resolution restriction mapping and fingerprinting,
diagnostic restriction of small PCR fragments, and construction of genomic DNA
libraries.

Another aspect of the present invention involves quasi-random 
fragmentation of DNA using the method of the present invention for epitope 
mapping and cloning intact genes. The same method as described above for 
shotgun cloning is utilized, except that an expression vector is used to generate
functional proteins from the DNA.

Another aspect of the present invention involves fragmenting DNA 
using the present invention to generate multiple oligonucleotides from any
double-stranded DNA template. Restriction-generated oligonucleotides (RGO) are
sequence specific oligonucleotides generated from any DNA according to the
present invention. CviJ I* presumably cleaves the recognition sequence GC
between the G and C to leave blunt ends (Xia et al., Nucl. Acids Res. 15:6075-
6090 (1987)). Because of the high frequency of dinucleotide GC in all DNA
(16 bp average fragment size for random DNA), a complete CviJ I* restriction
results in numerous fragments which are about 20-200 bp in size. These 
restriction fragments are generated from an aliquot of the template itself and are
heat-denatured to yield numerous single-stranded oligonucleotides which are of 
variable length but which are specific for the cognate template. Complete CviJ 
I restriction of the small plasmid pUC19 (2689 bp) theoretically yields 314 
oligonucleotides after a heat-denaturation step. The ability to generate numerous 
oligonucleotides with perfect sequence specificity is an unusual result of the use 
of this class of enzyme according to the present invention. Such oligonucleotides 
are uniquely suited for purposes of labeling DNA, as described below. 
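
The generation of restriction-generated oligonucleotides described above can be sketched in silico: cut a template between the G and C of every GC dinucleotide, then treat each blunt-ended fragment as two single strands after heat denaturation. This is an illustrative model only, assuming cleavage at every GC regardless of flanking sequence; the function names and the toy sequence are hypothetical, and no real plasmid sequence is embedded here.

```python
# In silico sketch of complete digestion at GC followed by heat denaturation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_cut_positions(seq: str, circular: bool = False) -> list[int]:
    """Positions at which the molecule is cut (between G and C)."""
    n = len(seq)
    last = n if circular else n - 1
    return sorted((i + 1) % n for i in range(last)
                  if seq[i] == "G" and seq[(i + 1) % n] == "C")


def digest(seq: str, circular: bool = False) -> list[str]:
    """Double-stranded fragments (top strand shown) from a complete digest."""
    seq = seq.upper()
    cuts = gc_cut_positions(seq, circular)
    if not cuts:
        return [seq]
    if circular:
        first = cuts[0]
        rotated = seq[first:] + seq[:first]  # start the molecule at a cut
        bounds = [c - first for c in cuts] + [len(seq)]
    else:
        rotated = seq
        bounds = [0] + cuts + [len(seq)]
    return [rotated[a:b] for a, b in zip(bounds, bounds[1:])]


def denature(fragments: list[str]) -> list[str]:
    """Each duplex fragment yields its top strand and the reverse complement."""
    oligos = []
    for frag in fragments:
        oligos.append(frag)
        oligos.append(frag.translate(COMPLEMENT)[::-1])
    return oligos


if __name__ == "__main__":
    toy = "AGCTTGCATGCC"
    print(digest(toy))                          # linear template
    print(len(denature(digest(toy, circular=True))))  # circular template
```

Because a circular molecule cut at n sites yields n fragments and each fragment yields two strands, a figure such as the 314 oligonucleotides quoted for pUC19 would correspond to 157 double-stranded fragments under this model.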

One application of CviJ I* restriction-generated oligonucleotides is 
to directly label them using conventional methods. There are several important 
advantages in using CviJ I* restriction-generated oligonucleotides. Conventional 
methods employing synthetic oligonucleotides for detection purposes generally use 
one oligonucleotide containing one or a few labels. A complete CviJ I* digest 
generates hundreds of oligonucleotides from a given template, depending on the 
size of the template, and thus makes hundreds of sites available for labeling, 
regardless of the labeling scheme utilized. These hundreds of sequence specific 
restriction-generated oligonucleotides have two important advantages over 
conventional probes used in nucleic acid detection methods. First, the generation 
of multiple oligonucleotide probes directed at multiple sites in a given target 
(theoretically, 314 sites in pUC19) provides enhanced detection sensitivities 
compared to synthetic oligonucleotides which are directed at 1 or a few sites in 
a target. The numerous labeled restriction-generated oligonucleotides represent 
a 10-100 fold amplification of the signal for detection compared to the use of a 
single oligonucleotide. Second, the short length of the restriction-generated 
oligonucleotides permits more efficient hybridization. This is important for two 
reasons. First, the hybridization time using restriction-generated oligonucleotides is
reduced to 1 hr as opposed to an overnight incubation with conventional probes
hundreds of nucleotides in length. This is a very important advantage when using
oligonucleotide probes in clinical settings. Second, the penetration of probes into
permeabilized cells is a critical issue for in situ hybridization procedures. The
smaller the probe, the easier the entry into the cell. Thus, the use of multiple 
oligonucleotide probes generated by the two base cutters greatly improves the 
sensitivity of in situ hybridization, a technique of considerable importance in 
research and clinical labs. Finally, when using membrane-based hybridization 
procedures, only small sections of a target nucleic acid are exposed and available 
for hybridization. Multiple oligonucleotides derived from a cognate template 
exhibit better detection sensitivities compared to long probes. 

Another application of restriction-generated oligonucleotides for 
labeling is to employ them as primers in a polymerase extension labeling reaction 
in conjunction with a repetitive thermal cycling regimen of denaturation, 
annealing, and extension. Thermal Cycle Labeling (TCL) is a method for 
efficiently labeling double-stranded DNA while simultaneously amplifying large 
amounts of the labeled probe. The TCL system employs the two base recognition 
endonuclease CviJ I* to generate sequence-specific oligonucleotides from the 
template DNA itself. These oligonucleotides are combined with the intact 
template and subjected to repeated cycles of denaturation, annealing, and 
extension by a thermostable DNA polymerase from, for example, Thermus flavus.
A radioactive- or non-isotopically-labeled deoxynucleotide triphosphate is 
incorporated during the extension step for subsequent detection purposes. The 
amplified, labeled probes represent a very heterogeneous mixture of fragments, 
which appears as a large molecular weight smear when analyzed by agarose gel 
electrophoresis. Primer-primer amplification, a side product of this reaction 
(produced by leaving out the intact template in the TCL reaction), may result in 
enhanced detection sensitivity, perhaps by forming branched structures. Biotin- 
labeled probes generated by the TCL protocol detect as little as 25 zeptomoles 
(2.5 x 10⁻²⁰ moles) of a target sequence. A 50 µl TCL reaction yields as much
as 25 µg of labeled DNA, enough to probe 25 to 50 Southern blots. After 20
cycles of denaturation and extension, biotin-dUTP-incorporated TCL probes may
be routinely detected at a 1:10⁶ dilution, which is 1000 fold more sensitive than
RPL, and indicates that a significant degree of net synthesis or amplification of 
the probe is occurring. In addition, non-isotopically-labeled TCL probes exhibit 
a 10-fold improvement in detection sensitivity when compared to RPL-generated 
probes. ³²P-labeled probes generated by the TCL protocol may also detect as
little as 50 zeptomoles (2.5 x 10⁻²⁰ moles) of a target sequence. As little as 10
pg of template DNA is enough to synthesize 5-10 ng of probe. The radioactive version of
TCL generates probes having extremely high specific activities, e.g. (about 5 x
10⁹ cpm/µg DNA), which permits 5 to 10-fold lower detection limits than
conventional labeling protocols. 
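
To put the detection limits above in more familiar terms, a zeptomole quantity can be converted to moles and to an approximate number of target molecules. The short sketch below does this for the 25 zeptomole figure; the function names are hypothetical, and the only outside value used is the Avogadro constant.

```python
# Convert zeptomole amounts to moles and to molecule counts.
AVOGADRO = 6.02214076e23  # molecules per mole
ZEPTO = 1e-21


def zeptomoles_to_moles(zmol: float) -> float:
    return zmol * ZEPTO


def zeptomoles_to_molecules(zmol: float) -> float:
    return zmol * ZEPTO * AVOGADRO


if __name__ == "__main__":
    zmol = 25
    print(f"{zmol} zmol = {zeptomoles_to_moles(zmol):.1e} mol")
    print(f"{zmol} zmol is about {zeptomoles_to_molecules(zmol):,.0f} molecules")
```

On this basis 25 zeptomoles corresponds to 2.5 x 10⁻²⁰ moles, or on the order of fifteen thousand copies of the target sequence.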

There are several advantages to using restriction-generated 
oligonucleotides for primer extension labeling of DNA. One advantage is the 
specificity of the primers. All of the oligonucleotides generated by the TCL 
system are specific for the template utilized, unlike random primer labeling (RPL) 
which utilizes synthetic oligonucleotides 6-9 bases in length having a random 
sequence. The amount of primer required for efficient labeling with the TCL 
system is only 10 ng, compared to the 10 µg of random primers utilized for RPL.
Due to their short length, random primers anneal very inefficiently above 25-
37°C, thus RPL is limited to DNA polymerases such as Klenow or T7. The
restriction-generated oligonucleotides are longer than the random primers,
which extends the hybridization and extension conditions to include a wide variety 
of temperatures and polymerases. Thus, the use of the restriction-generated 
sequence-specific oligonucleotides results in more efficient hybridization and 
extension as compared to RPL. The TCL system has been optimized for labeling 
with a thermostable DNA polymerase which allows the option of temperature 
cycling. After 20 cycles of denaturation and extension, a significant amount of 
amplified TCL probes can be generated. Most importantly, TCL-labeled probes 
exhibit a 10 fold improvement in detection sensitivity when compared to RPL-
generated probes. 
Another aspect of the present invention involves a variation of TCL
called Universal Thermal Cycle Labelling (UTCL) in which the extension primers
are not supplied by CviJ I restriction, but rather, are found endogenously in the
enzyme preparations of thermostable DNA polymerases. Random sequence DNA 
is usually co-purified along with the holo-enzyme preparation of the thermostable 
DNA polymerases, regardless of the source of the enzyme, i.e. native or cloned. 
However, only the holo-enzyme, and not the exonuclease minus deletion variants, 
contain the endogenous DNA. Typically, when the holo-enzymes of thermostable 
polymerases are used in protocols such as the polymerase chain reaction, the 
presence of such primers can create spurious results. Methods for circumventing 
the problems of endogenous DNA are described in PCR Protocols: A Guide to 
Methods and Applications, Eds. M. Innis, et al., Academic Press, 1990.

This residual DNA is rather short (approximately 5-25 bases), as
assayed by end-labeling with [³²P]ATP and polynucleotide kinase, and acts as
endogenous "random" primers in a TCL-type reaction. UTCL combines the holo-
enzyme of a thermostable polymerase from, for example, Thermus flavus, with
the intact DNA template, and the mixture is subjected to repeated cycles of denaturation,
annealing, and extension. A radioactive- or non-isotopically-labeled
deoxynucleotide triphosphate is incorporated during the extension step for 
subsequent detection purposes. The amplified, labeled probe represents a very 
heterogeneous mixture of fragments, which appears as a large molecular weight
smear when analyzed by agarose gel electrophoresis. Biotin-labeled probes
generated by the UTCL protocol detect as little as 25 zeptomoles (2.5 x 10⁻²⁰
moles) of a target sequence. A 15 µl UTCL reaction yields as much as 5-10 µg
of labeled DNA, enough to probe 5 to 10 Southern blots. After 20 cycles of
denaturation and extension, biotin-dUTP-incorporated UTCL probes may be
routinely detected at a 1:10⁶ dilution, which is 1000 fold more sensitive than
RPL, and indicates that a significant degree of net synthesis or amplification of
the probe is occurring. In addition, non-isotopically-labeled UTCL probes exhibit
a 10-fold improvement in detection sensitivity when compared to RPL-generated
probes. ³²P-labeled probes generated by the UTCL protocol may also detect as
little as 50 zeptomoles (2.5 x 10⁻²⁰ moles) of a target sequence. The radioactive
version of UTCL generates probes having extremely high specific activities, e.g.
(about 5 x 10⁹ cpm/µg DNA), which permits 5 to 10-fold lower detection limits
than conventional labeling protocols. 

The present invention is illustrated by the following examples
relating to the isolation of a full length viral DNA clone encoding R.CviJI, to the
expression of R.CviJI DNA in E. coli strain DH5αF'MCR and to purification of
R.CviJI from this bacterial strain. More particularly, Example 1 provides for the
propagation of IL-3A virus and isolation of viral genomic DNA. Example 2
addresses the improved expression of a clone for the viral methylase M.CviJI.
Example 3 describes the strategy for isolating and cloning the viral R.CviJI gene
by a forced co-cloning strategy of the M.CviJI gene. Example 4 describes the
sequencing of cloned IL-3A genomic DNA and identification of the R.CviJI gene.
Example 5 relates the methods for purification of CviJI to homogeneity from an
E. coli strain, DH5αF'MCR, transformed with a plasmid which encodes the
R.CviJI enzyme. Example 6 details the amino acid sequence analysis of the
purified R.CviJI enzyme. Example 7 describes the analysis of CviJI* recognition
sequences. Example 8 relates to a technique for producing restriction generated
oligonucleotides using CviJI. Example 9 relates the generation of anonymous
primers using CviJI. Example 10 describes end-labeling of CviJI restriction
generated oligonucleotides. Example 11 describes primer extension labeling of
DNA using restriction generated oligonucleotides. Example 12 relates the use of
CviJI in thermal cycle labeling of DNA as well as the method of universal thermal
cycle labelling. Example 13 provides a method for generation of quasi-random
DNA fragments using CviJI. Example 14 describes fractionation of CviJI digested
DNA by size using spin column chromatography. Example 15 details the relative
cloning efficiency of CviJI digested, size-fractionated DNA by gel elution and
chromatographic methods. Example 16 describes the comparison of cloning
efficiency using lambda DNA fragmented by both sonication and CviJI**
techniques. Example 17 details the use of CviJI** fragmentation for shotgun
cloning and sequencing. Example 18 describes the shotgun cloning of lambda
DNA using CviJI. Example 19 describes the use of CviJI in epitope mapping
techniques. Example 20 describes the restriction endonuclease reagent CGase I.

Example 1 
Propagation of IL-3A Virus 

The exsymbiotic Chlorella-like alga, NC64A, originally isolated 
from Paramecium bursaria (Karakashian, S.J. and Karakashian, M.W., Evolution 
and Symbiosis in the Genus Chlorella and Related Algae. Evolution 19:368-377 
(1965)), was grown and maintained in Bold's basal medium (BBM), (Nichols, 
H.W. and Bold, H.C. J. Phycol. 1:34-38 (1965)) modified by the addition of 
0.5% sucrose, 0.1% proteose peptone, and 20 µg/ml tetracycline (MBBM).
Cultures were inoculated with 1 x 10⁶ algae cells/ml and grown at 25°C in 250
ml of MBBM in 500 ml Erlenmeyer flasks on a rotary shaker (150 rpm) in 
continuous light (ca. 30 µE m⁻² sec⁻¹). Growth was monitored by light
scattering measured as absorbance and/or by direct cell counts with a
hemocytometer. 

When the cultures reached approximately 1 x 10⁷ algae cells/ml
they were inoculated with filter sterilized (0.4 µm nitrocellulose filter,
Nucleopore, Pleasanton, California) IL-3A virus at a multiplicity of infection of
0.01 and incubated for an additional 48 - 72 hours at 25°C. The crude lysate was
then centrifuged at 3000 rpm (2000 xg) for 10 minutes to remove cellular debris.
Nonidet P-40 was then added to 1% (v/v) and the virus was pelleted from the 
supernatant by centrifuging at 15,000 rpm at 4°C for 75 minutes in a Beckman 
No. 30 rotor. The viral pellet was gently resuspended in 0.05 M Tris-HCl, pH 
7.8, and the sample was layered on linear 10 - 40% sucrose gradients equilibrated
with 0.05 M Tris-HCl, pH 7.8, and centrifuged for 20 minutes at 20,000 rpm at
4°C in a Beckman SW28 rotor. The viral band, which was present in the center
of the gradient as an opaque band, was removed, diluted with 0.05 M Tris-HCl,
pH 7.8, and pelleted by centrifugation at 15,000 rpm at 4°C for 120 minutes in
a Beckman No. 80 rotor. The virus was resuspended in a small volume (10ml) 
of 0.05 M Tris-HCl, pH 7.8, and stored at 4°C. 
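
The amount of virus implied by the infection conditions above (250 ml cultures at about 1 x 10⁷ cells/ml, multiplicity of infection 0.01) can be estimated as follows. This is a minimal sketch: the function names are hypothetical, and the stock titer used in the volume example is an arbitrary placeholder chosen for illustration, not a value taken from this example.

```python
# Plaque forming units (pfu) needed for a target multiplicity of infection.
def virus_needed_pfu(cells_per_ml: float, culture_ml: float, moi: float) -> float:
    """Total pfu = total cells x multiplicity of infection."""
    return cells_per_ml * culture_ml * moi


def inoculum_volume_ul(pfu_needed: float, stock_pfu_per_ml: float) -> float:
    """Volume of virus stock, in microliters, that contains pfu_needed."""
    return pfu_needed / stock_pfu_per_ml * 1000.0


if __name__ == "__main__":
    pfu = virus_needed_pfu(cells_per_ml=1e7, culture_ml=250, moi=0.01)
    print(f"pfu required: {pfu:.2e}")
    # Hypothetical stock titer of 1e9 pfu/ml, for illustration only.
    print(f"volume of a 1e9 pfu/ml stock: {inoculum_volume_ul(pfu, 1e9):.1f} µl")
```

Under these conditions roughly 2.5 x 10⁷ pfu are needed per flask; the volume to add depends entirely on the titer of the particular lysate used.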

IL-3A viral DNA was purified from the viral particles using a 
modification of the protocol described by Miller, S.A., Dykes, D.D., and
Polesky, H.I., Nucleic Acids Res. 16:1215 (1988). Briefly, 100 µl of IL-3A
virus (9.8 x 10¹¹ plaque forming units/ml) was diluted with 400 µl of water and
then mixed with 10 µl TEN (0.5 M Tris-HCl, pH 9.0, 20 mM EDTA, 10 mM
NaCl) and 10 µl of 10% SDS. After incubating at 70°C for 30 minutes the
solution was extracted twice with phenol-chloroform-isoamyl alcohol, extracted
once with chloroform, and precipitated with ice-cold ethanol using methods well
known in the art and resuspended in 500 µl of H₂O (Ausubel, F.M., Brent, R.,
Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K. (Eds.) 
(1987) Current Protocols in Molecular Biology, Wiley, New York; Sambrook, J., 
Fritsch, E.F. and Maniatis, T. (1989), Molecular Cloning: A Laboratory Manual, 
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). 

Example 2 
CviJI Methyltransferase Clone 

The CviJI methyltransferase gene (M.CviJI) from Chlorella virus
IL-3A was cloned and sequenced by Shields et al., Virology 176:16-24 (1990).
Briefly, a Sau3A partial digest of Chlorella virus IL-3A was ligated to BamHI
digested pUC19 and transformed into E. coli strain RR1. This library of plasmids
was restricted with HindIII (AAGCTT) and SstI (GAGCTC), both of which are
inhibited by 5-methylcytidine (5mC) in the AGCT portion of their recognition
sequences, and transformed again into RR1 cells. M.CviJI methylates the internal
cytidine in (G/A)GC(T/C/G) sequences. If the M.CviJI gene is cloned and
expressed appropriately, the plasmid DNA would be expected to be resistant to
HindIII and SstI restriction.
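
The selection logic above rests on the M.CviJI target, (G/A)GC(T/C/G), occurring inside the recognition sequences of the enzymes used for screening. The sketch below checks only that sequence overlap; the function name is hypothetical, and whether a given enzyme is actually blocked also depends on which cytosine is methylated and on the sensitivity of that enzyme, which this code does not model. EcoRI is included purely as a contrasting example of a site with no GC dinucleotide.

```python
import re

# M.CviJI target as stated in the text: (G/A)GC(T/C/G)
M_CVIJI_TARGET = re.compile(r"[GA]GC[TCG]")


def contains_methylase_target(recognition_site: str) -> bool:
    """True if the methylase target occurs wholly within the recognition site."""
    return M_CVIJI_TARGET.search(recognition_site.upper()) is not None


if __name__ == "__main__":
    sites = {"HindIII": "AAGCTT", "SstI": "GAGCTC", "EcoRI": "GAATTC"}
    for enzyme, site in sites.items():
        verdict = "contains" if contains_methylase_target(site) else "lacks"
        print(f"{enzyme} ({site}) {verdict} an M.CviJI target")
```

Both HindIII and SstI sites contain AGCT, which matches the methylase target, consistent with their use as a screen for methylase expression.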

The CviJI methyltransferase gene was originally cloned as a 7.2 kb
insert, termed pIL-3A.22. Plasmid pIL-3A.22 was only partially resistant to CviJI
digestion. Partial digestion is most likely due to the inefficient expression of the
M.CviJI gene and the numerous CviJI sites in both the vector (pUC19 has 45
CviJI sites) and in the insert DNA. The M.CviJI gene was eventually sublocalized
to a region of 3.7 kb by subcloning using methods well known in the art
(Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith,
J.A. and Struhl, K. (Eds.) (1987) Current Protocols in Molecular Biology, Wiley,
New York; Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989), Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, New York) and testing the subcloned DNA for
sensitivity/resistance to HindIII, SstI, and CviJI (Shields et al., supra). The
entire sequence was determined and three open reading frames which could code
for polypeptides of 161, 367, and 162 amino acids, respectively, were identified.
The 367 amino acid open reading frame (ORF) was identified as the M.CviJI gene
by three criteria: (i) it is the only ORF located in the region identified by
transposon mutagenesis; (ii) it has amino acid motifs similar to those of other
cytosine methyltransferases; and (iii) a 1.6 kb DraI fragment containing the 367
amino acid ORF (1101 bp) produces the methyltransferase. This 1.6 kb M.CviJI
encoding fragment was subcloned into the EcoRV site of pBluescript KS(-)
(Stratagene, La Jolla, CA), in the same translational orientation as the lacZ' gene
of this vector. A physical map of the resulting plasmid termed p710 is shown in
Figure 1. 
The plasmid p710 was digested with several endonucleases to
indirectly test the efficiency of M.CviJI expression. Fully active methylase should
render the plasmid DNA completely resistant to digestion by the following
enzymes: HaeIII (which recognizes the sequence GGCC), SacI (which recognizes
the sequence GAGCTC), and HindIII (which recognizes the sequence AAGCTT).
The plasmid was partially resistant to HaeIII (90%) and SacI (90%), and even less
resistant to HindIII (25%) digestion. This lack of complete protection of the
plasmid DNA made it impractical to attempt cloning the three/two base restriction
endonuclease encoded by the R.CviJI gene. Thus, improvements in the efficiency
of M.CviJI expression were required before attempting to clone the R.CviJI gene.

The translation efficiency of the M.CviJI gene was improved by
removing extraneous 5' open reading frames, creating a perfect fusion of the
lacZ' Shine-Dalgarno sequence with the methyltransferase start codon (see Figure
1). This was achieved by site-specific oligonucleotide mutagenesis, using the
oligomer

5'-CAATTTCACACAGGAAACAGCTATGTCTTTTCGCACGTTAGAAC-3'
(SEQ ID NO: 1) to precisely remove the intervening lacZ' DNA. The relevant
DNA sequences are indicated in Figure 1 (SEQ ID NO: 12). The mutagenesis was
facilitated by converting the double stranded plasmid DNA of p710 to
single-stranded DNA by co-infecting the E. coli host strain with the helper phage R408
(Russel, M., Kidd, S. and Kelly, M.R. Gene 45:333-338), using methods well
known in the art. The mutagenesis reaction was completed using a commercially
available kit according to the manufacturer's instruction (Mutagene, Bio-Rad,
Hercules, California). The oligonucleotide was annealed to the single-stranded
plasmid, extended in the presence of T4 DNA polymerase, ligated using T4 DNA
ligase, and transformed into competent SURE cells (Stratagene, La Jolla,
California). Transformed cells were then grown overnight as a pool, the DNA 
isolated and purified. 
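
The structure of the mutagenic oligomer (SEQ ID NO: 1) can be inspected by locating its first ATG and splitting the remainder into codons. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the first ATG in the oligomer is the methyltransferase start codon referred to above and that the purine-rich AGGA stretch upstream of it belongs to the lacZ' Shine-Dalgarno region.

```python
# Inspect the mutagenic oligomer: upstream (lacZ'-derived) arm and coding arm.
OLIGO = "CAATTTCACACAGGAAACAGCTATGTCTTTTCGCACGTTAGAAC"  # SEQ ID NO: 1


def split_at_start_codon(oligo: str, start: str = "ATG") -> tuple[str, list[str]]:
    """Return the sequence upstream of the first start codon and the
    complete in-frame codons from that start codon onward."""
    i = oligo.find(start)
    if i < 0:
        raise ValueError("no start codon found")
    coding = oligo[i:]
    usable = len(coding) - len(coding) % 3  # drop any trailing partial codon
    codons = [coding[j:j + 3] for j in range(0, usable, 3)]
    return oligo[:i], codons


if __name__ == "__main__":
    upstream, codons = split_at_start_codon(OLIGO)
    print("length:", len(OLIGO))
    print("upstream arm:", upstream)
    print("codons:", " ".join(codons))
```

Read this way, the 44-mer consists of a 22 nucleotide upstream arm followed by the start codon and six further complete codons, which is the kind of two-armed design that lets the oligomer bridge and delete the intervening DNA.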
Enrichment for the mutagenized plasmids was made possible by
virtue of the loss of an XhoI site located in the sequence that was deleted by
mutagenesis. Enrichment was accomplished by digesting the isolated, purified
plasmid DNA with XhoI, followed by dephosphorylation with calf intestinal
alkaline phosphatase (CIAP), and transformation into SURE cells. Plasmid DNA
was isolated from 18 individual colonies and the DNA tested for resistance to
XhoI. Plasmid DNA from 11 colonies was resistant to XhoI digestion, indicating
that they lacked the deleted sequence. Five of these plasmids were restricted with
HaeIII, HindIII, PvuII (which recognizes the sequence CAGCTG), and CviJI. All
five appeared 100% resistant to these enzymes. Four of the plasmids were 
sequenced and the deletion was confirmed as being correct. One of these, 
pBMC5, was chosen for further modification. 

Example 3 
Forced Co-Cloning of R.CviJI

The location of the R.CviJI gene on the IL-3A virus genome was
inferred as being 3' to the M.CviJI gene for two reasons: 1) the cloned DNA
sequence 5' to the M.CviJI gene did not produce a restriction activity; and 2)
several attempts to clone the DNA 3' to the M.CviJI gene resulted in
deletions/rearrangements of this downstream region. This information permitted
a forced co-cloning strategy to obtain the restriction endonuclease gene. This
strategy uses a deletion derivative of pBMC5 lacking the 3' half of the M.CviJI
gene. Digestion of the IL-3A genome with the same enzyme used to create the
M.CviJI deletion, followed by ligation of the respective DNAs, transformation,
and digestion with enzymes incapable of recognizing methylated DNA (e.g.,
HaeIII, HindIII, PvuII, CviJI, etc.) should force the selection of clones which
have a restored M.CviJI gene (and thus active methylase enzyme), as well as
downstream DNA. Thus, if a clone is found to be CviJI resistant, the 3' half of
M.CviJI must have been restored, and downstream DNA containing the R.CviJI
gene, at least in part, would presumably be cloned.

The details of this cloning strategy are as follows. pBMC5 has two
EcoRI sites, one approximately in the middle of the M.CviJI gene, while the other
site lies in the vector DNA, 3' to the M.CviJI gene (see Figure 1). pBMC5 was
restricted with EcoRI and ligated at a dilute concentration (10-50 ng/µl) to favor
circularization without the 3' M.CviJI fragment. The reaction mixture was then
transformed into competent SURE cells and plated on TY agar containing
ampicillin. Plasmid DNA from the resulting colonies was tested for the lack of
this EcoRI fragment by digestion with EcoRI. One of these clones, pBMC5RI,
was used for the subsequent co-cloning work. Plasmid pBMC5RI was digested
with EcoRI and dephosphorylated using CIAP. IL-3A genomic DNA was then
digested to completion with EcoRI. The EcoRI-digested pBMC5RI and IL-3A
DNAs were combined at a ratio of 1:3 in a ligation reaction using T4 DNA
ligase, and the products of the ligation reaction were subsequently used to
transform competent SURE cells. The pBMC5RI/IL-3A transformants were not
plated, but rather grown overnight in culture as a library or pool of cells. The
cells were harvested the next day and DNA was isolated and purified. Isolated,
purified DNA was digested with HaeIII, dephosphorylated with CIAP, and
transformed into competent SURE cells. The cells were then plated and grown
overnight. Six colonies grew, of which only one, containing the plasmid
pCJH1.4, was resistant to HaeIII. The plasmid pCJH1.4 was found to encode
CviJI restriction activity. Plasmid pCJH1.4 was further characterized to localize
the gene for CviJI by deletion analysis, subcloning experiments, and sequencing.
The plasmid pCJH1.4 was deposited with the American Type Culture Collection
on June 30, 1993 under Accession Number 69341.

Example 4

Sequencing of Cloned IL-3A DNA Containing CviJI Gene

The EcoRI fragment cloned into pCJH1.4 (as described in Example
3) is 4901 bp in length. Except for the 519 bp corresponding to the 3' portion
of the M.CviJI gene, the remainder of the 4901 bp EcoRI fragment cloned into
pCJH1.4 was sequenced using the SEQUAL DNA Sequencing System
(CHIMERx, Madison, WI) by methods well known in the art. Sequencing was
accomplished using three approaches: 1) primer walking on pCJH1.4; 2) cloning
various restriction endonuclease digests of pCJH1.4 into an M13 type sequencing
vector; and 3) sequencing various restriction endonuclease deletion derivatives of
pCJH1.4. The nucleotide sequence of 5497 bp of IL-3A viral DNA is shown in
Figure 2 and set forth in SEQ ID NO.: 2.

Six open reading frames (ORFs) of 1155 bp (ORF1), 468 bp
(ORF2), 555 bp (ORF3), 1086 bp (ORF4), 397 bp (ORF5) and 580 bp (ORF6),
which could code for polypeptides containing 358 (41.4 kD), 156 (19.4 kD), 185
(20.3 kD), 362 (38.9 kD), 132 (14.5 kD) and 193 (21.9 kD) amino acids,
respectively, were identified (see Figure 3). ORFs 4-6 do not code for the
R.CviJI gene, as the deletion derivative pCdA12, which lacks the DNA between
the AvaI and BamHI sites (see Figure 3), does produce CviJI restriction
endonuclease activity. In addition, the deletion derivative pCdEB7, lacking the
DNA between the EcoRI and BamHI sites, did not produce CviJI activity. Thus
ORF1 or ORF3 were the most likely candidates for encoding the R.CviJI gene.
The sequence of the 1155 bp ORF1 (SEQ ID NO: 3), its deduced amino acid
sequence (SEQ ID NO: 4) (as shown in capital letters), plus flanking bases, is
presented in Figure 4. The vertical line in Figure 4 and the associated arrow
indicate where the DNA sequence from pCJH1.4 diverges from that of pIL-3A.22-8
(Shields, S.L., et al., Virology 176:16-24, 1990). This open reading
frame (ORF1) is believed to represent the CviJI gene because 14 out of 15
N-terminal amino acids from the protein sequence (see Example 6) matched the
predicted translation product of the nucleic acid sequence (Figure 4). Also, the
32.5 kD molecular weight of the homogeneously purified enzyme described in
Example 5 matched the predicted translation product of the nucleic acid sequence
(31.6 kD) if the encoded protein was translated beginning at the GTG codon
located at nucleotides 299-301 (Figure 4), instead of the 5' ATG codon located
at nucleotides 59-61. This possibility is not surprising in light of the fact that
approximately 10% of prokaryotic and eukaryotic gene products begin translation
with a GTG start codon, rather than the usual ATG codon (Kozak, M., Microbiol.
Rev. 47:1-45 (1983); Kozak, M., J. Cell. Biol. 108:229 (1989); Gold, L. et al.,
Annu. Rev. Microbiol. 35:365-403 (1981)). The structural gene was identified to
be 834 nucleotides in length, coding for a protein of 278 amino acids (31.6 kD)
and is set forth in SEQ ID NO: 4. It is also interesting to note that the CviJI gene
was shown to possess an in-frame, upstream ATG codon which if translated could
yield a protein with a predicted molecular weight of 41.4 kD (Figure 4). A larger
molecular weight form possessing CviJI restriction activity has not been detected
by SDS gel electrophoresis. However, a second peak of CviJI activity which
eluted separately from the 32.5 kD form was detected in the initial stages of
enzyme purification. The DNA sequence which could theoretically code for a
larger form of CviJI would be approximately 1074 nucleotides in length (assuming
it starts at the upstream ATG codon) and would code for a protein of 358 amino
acids.
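
The relationship between the coding lengths and protein sizes quoted above can be checked with a short sketch. The Python below is illustrative only: the function names are hypothetical, and the average residue mass of 110 Da is a common rule of thumb assumed here rather than a value derived from the sequence. The masses it returns are therefore rough estimates, falling within about 5% of the 31.6 kD and 41.4 kD figures predicted from the actual sequence.

```python
# Rough check of coding length versus protein size.
# ASSUMPTION: 110 Da is a rule-of-thumb average residue mass, not a value
# taken from the CviJI sequence; results are estimates only.
AVERAGE_RESIDUE_MASS_DA = 110.0


def residues_from_coding_length(nucleotides: int) -> int:
    """Number of amino acids encoded by a coding region (stop codon excluded)."""
    if nucleotides % 3 != 0:
        raise ValueError("coding length must be a multiple of 3")
    return nucleotides // 3


def approx_mass_kd(residues: int) -> float:
    """Very rough molecular weight estimate in kilodaltons."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for label, length in (("GTG start", 834), ("upstream ATG start", 1074)):
        n = residues_from_coding_length(length)
        print(f"{label}: {length} nt -> {n} aa, ~{approx_mass_kd(n):.1f} kD")
```

The residue counts (278 and 358) follow exactly from the nucleotide lengths; only the mass estimate depends on the assumed average.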

Example 5 

Purification of Recombinant CviJI Restriction Endonuclease

Initially, 20 ml of LB medium (plus 100 µg/ml ampicillin) were
inoculated with a 1 ml stock of E. coli transformed with the plasmid pCJH1.4
described above and grown overnight at 37°C with shaking. The next day, 20 ml
of this initial overnight culture was used to inoculate another 1 liter of LB
medium and grown overnight. The following day, 50 liters of TB medium (12
g Bacto-Tryptone, 24 g Bacto Yeast Extract, 4 ml glycerol, 2.31 g KH2PO4,
12.54 g K2HPO4, 0.1 g MgSO4, 100 µg/ml ampicillin, and water to 1 liter) were
inoculated with an aliquot of the secondary overnight culture and grown at 37°C
with 20 liters/min aeration at 200 RPM, until the OD595nm reached 1.0 unit.
Vigorous aeration was essential for CviJI expression, and a typical yield was
70 g of cell paste after centrifugation.

The cell pellet was immediately resuspended in lysis buffer A
(30 mM Tris-HCl, pH 7.9 at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol,
50 µg/ml phenylmethylsulfonyl fluoride (PMSF), 20 µg/ml benzamidine, 2 µg/ml
o-phenanthroline, 0.7 µg/ml pepstatin) at a volume of 3 ml of buffer A per 1 g of
cells. The cell suspension was then passed through a Manton-Gaulin cell
disrupter (Gaulin Corporation, Everett, MA) twice and centrifuged for 1 hr (8000
RPM, Sorvall GS3 Rotor) at 4°C. To the supernatant, solid NaCl was added to
a final concentration of 200 mM, and 10% polyethyleneimine (PEI) solution was
slowly added to a final concentration of 1%. The mixture was stirred for 3 hr,
and then centrifuged for 30 min at 4°C, 8000 RPM (Sorvall GS3 Rotor). Solid
ammonium sulfate was then added to the supernatant at 0.5 g/ml and the mixture
was stirred overnight at 4°C. The precipitated proteins were centrifuged for 1 hr
(8000 RPM, Sorvall GS3 Rotor) at 4°C and the resulting pellet was dissolved in
100 ml of buffer B (10 mM K/PO4, pH 7.2, 0.5 mM EDTA, 10 mM
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.05% Triton X-100, 50 µg/ml
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin).
The dissolved protein solution was then dialyzed (14 kD cut-off) for 12 hours
against three 1 liter changes of buffer B. The dialyzed solution was then diluted
to 600 ml with buffer B and applied to a 5 x 20 cm phosphocellulose P11
(Whatman) column (flow rate 100 ml/hr).

The column was then washed with 1.5 liters of buffer B followed
by a 0 - 1.5 M NaCl gradient in buffer B (5 liters). R.CviJI eluted at
approximately 600 mM NaCl. The active fractions were then pooled and
concentrated to 50 ml with a 76 mm Amicon YM10 membrane. The resulting
solution was then diluted to 300 ml with buffer C (20 mM Tris-acetate, pH 7.4
at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol, 50 mM NaCl, 10%
glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml
o-phenanthroline, 0.7 µg/ml pepstatin) and applied to a 2.5 x 7 cm
Heparin-Sepharose column at a flow rate of 25 ml/hr.

After a 400 ml wash with buffer B, R.CviJI was eluted with a
1.5 liter gradient of 0 - 1.3 M NaCl in buffer C. CviJI eluted at approximately
400 mM NaCl. The most active fractions were pooled and applied to a
2.5 x 7 cm Blue-agarose column equilibrated in buffer D (20 mM Tris-acetate pH
8.0, 1 mM EDTA, 7 mM beta-mercaptoethanol, 30 mM NaCl, 10% glycerol,
0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml
o-phenanthroline, 0.7 µg/ml pepstatin). After a 500 ml wash with buffer D, CviJI
was eluted with a 0 - 1.5 M NaCl gradient (1.5 liters) in buffer D. Active fractions
were dialyzed against buffer G (10 mM K/PO4 pH 7.0 (4°C), 10 mM
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.01% Triton X-100, 50 µg/ml
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin)
and loaded (20 ml/hr) onto a ceramic HTP column (American International
Chemical, Natick, MA) (1.5 x 3 cm), equilibrated in buffer F (20 mM Tris-HCl
pH 8.0, 0.5 mM EDTA, 3 mM DTT, 50 mM K-acetate, 5 mM Mg-acetate, 50%
glycerol). After washing with 100 ml of buffer F, a 400 ml gradient of 0 - 0.9 M
K/PO4 in buffer F was run. The HTP column was washed with buffer G
containing 3 mg/ml BSA, then with 1 M phosphate buffer, and reequilibrated in
buffer G. The active fractions were then pooled and concentrated using a YM10
membrane to a final volume of 3 - 4 ml. This concentrate was then applied to a
2.5 x 95 cm Sephadex G-100 column, equilibrated in buffer E (20 mM Tris-HCl
pH 7.5 (4°C), 5 mM Mg-acetate, 2 mM EDTA, 10 mM beta-mercaptoethanol,
100 mM NaCl, 5% glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml
benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin) at a flow rate of
6 ml/hr, and 3 ml fractions were collected. Active fractions were dialyzed against
storage buffer F.

The molecular weight of the purified CviJI was determined by
comparison to known protein standards on a denaturing 10% SDS polyacrylamide
gel. A single band migrating with an apparent molecular weight of 32.5
kilodaltons was seen, indicating that by these criteria CviJI was purified to
homogeneity.



Example 6 

N-Terminal Amino Acid Sequence of R.CviJI

To confirm that the restriction endonuclease encoded by the insert
in pCJH1.4 was CviJI, the sequence of the first 15 N-terminal amino acids of
purified CviJI was determined by the Edman degradation method using an Applied
Biosystems (Foster City, CA) 477A Liquid Phase Protein Sequencer with an
on-line 120A PTH Analyzer. The results of that analysis are shown in Table 1.

Table 1

N-Terminal Amino Acid Analysis of CviJI

Amino   Retention    pmol    pmol      pmol     pmol         Amino Acid ID
Acid #  Time (min)   (Raw)   (-bkgd)   (+lag)   Ratio

1       [illegible]  6.11    3.86      5.10     34.53        THR, MET, ARG, or LYS
2       10.32        3.92    1.54      1.82     [illegible]  GLU
3       10.33        4.28    2.22      2.18     11.96        GLU
4       27.37        2.23    1.49      1.72     7.64         LYS
5       27.35        2.37    1.66      1.67     7.39         LYS
6       17.95        3.37    2.76      2.81     9.48         ARG
7       28.10        3.19    1.73      2.08     6.09         LEU
8       13.58        3.58    2.11      2.49     12.08        ALA
9       28.10        3.23    1.68      1.58     4.63         LEU
10      18.17        0.71    0.78      0.36     1.21         ILE
11      10.30        1.65    0.78      0.96     5.26         GLU
12      9.72         8.03    0.41      1.31     3.25         LYS
13      8.53         1.54    0.53      0.55     2.97         GLN
14      18.18        2.19    1.74      1.67     5.63         ARG
15      26.80        3.33    0.43      [blank]  0.89         ILE

Abbreviations used: threonine (THR), methionine (MET), arginine (ARG), lysine 
(LYS), glutamic acid (GLU), leucine (LEU), alanine (ALA), isoleucine (ILE) and 
glutamine (GLN). 



The results of this analysis confirm that the protein encoded by the
DNA insert in pCJH1.4 (ORF1) is CviJI.

The following Examples illustrate some of the unique properties of
and important uses for CviJI.

Example 7
Analysis of CviJI* Recognition Sequences

The CviJI* recognition sequence (see Xia, et al., Nuc. Acids Res.
15: 6025-6090, 1987) was deduced by cloning and sequencing CviJI*-digested
pUC19 DNA fragments. A complete CviJI* digest of pUC19 was ligated to an
M13mp18 cloning derivative for nucleotide sequence analysis. The sequence of
the entire insert was read in order to determine which sites were or were not
utilized. A total of 100 clones were sequenced, resulting in 200 CviJI* restricted
junctions, the data for which are compiled in Table 2.

The dinucleotide GC is found at 205 sites in pUC19. These GC
sites (shown in Table 2) can be divided into four classes based on their flanking
Pu/Py structure: the normal recognition sequence (N) and three potential classes
of relaxed sites (R1, R2 and R3). As seen in Table 2, the fraction of such NGCN
sites which belong to each classification is roughly equal (22.0%-27.8%). A total
of 200 CviJI* restricted junctions were analyzed by sequencing 100 cloned inserts.
If CviJI* cleaved at all NGCN sites without sequence preferences, it would be
expected that each classification would be restricted approximately
equally. Instead, most of the sites cleaved by this treatment were found to be
normal, or PuGCPy, sites (47.5%). R1 (PyGCPy) and R2 (PuGCPu) restricted
sites were found at nearly the same frequency (25.5% and 27.0%, respectively).
Out of 200 CviJI* junctions, no R3 (PyGCPu) restricted sites were found. Thus,
CviJI* cleaves all NGCN sites except for PyGCPu. As CviJI* cleaves 12 out of
16 possible NGCN sites, it may be referred to as a 2.25-base recognition
endonuclease.
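
The four-way classification of NGCN sites described above lends itself to a small sketch. The Python below is illustrative only; the function names are hypothetical and not part of any published tool. It assigns each GC dinucleotide in a sequence to class N (PuGCPy), R1 (PyGCPy), R2 (PuGCPu) or R3 (PyGCPu) from its flanking bases, and confirms that 12 of the 16 possible NGCN sequences fall outside the non-cleaved R3 class.

```python
from itertools import product

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_gc_site(five_prime: str, three_prime: str) -> str:
    """Classify an NGCN site by the bases flanking the central GC."""
    five_prime, three_prime = five_prime.upper(), three_prime.upper()
    valid = PURINES | PYRIMIDINES
    if five_prime not in valid or three_prime not in valid:
        raise ValueError("flanking bases must be A, C, G or T")
    left_is_purine = five_prime in PURINES
    right_is_pyrimidine = three_prime in PYRIMIDINES
    if left_is_purine and right_is_pyrimidine:
        return "N"   # PuGCPy, the normal site
    if right_is_pyrimidine:
        return "R1"  # PyGCPy
    if left_is_purine:
        return "R2"  # PuGCPu
    return "R3"      # PyGCPu, not cleaved under CviJI* conditions


def tally_gc_sites(sequence: str) -> dict:
    """Count GC sites of each class in a linear sequence (one strand)."""
    seq = sequence.upper()
    counts = {"N": 0, "R1": 0, "R2": 0, "R3": 0}
    # A site needs one base on each side, so the ends are skipped.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            try:
                counts[classify_gc_site(seq[i - 1], seq[i + 2])] += 1
            except ValueError:
                continue  # skip sites flanked by ambiguous bases
    return counts


if __name__ == "__main__":
    classes = [classify_gc_site(a, b) for a, b in product("ACGT", repeat=2)]
    cleaved = sum(1 for c in classes if c != "R3")
    print(f"{cleaved} of {len(classes)} NGCN sequences are in cleaved classes")
```

Applied to a real plasmid sequence, `tally_gc_sites` would give the per-class site counts against which cleaved and non-cleaved junction frequencies can be compared.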

In addition to the restricted sites, those sites which were not cleaved
under CviJI* conditions were also compiled for analysis, as shown in Table 2. A
total of 116 non-cleaved NGCN sites were found in the 100 inserts which were
sequenced. PyGCPu sites represented the largest class of non-cleaved sites
(52.6%). In only two cases were PuGCPy sites found not to be cleaved. An
approximately equal fraction of R1 and R2 sites were not cleaved as were found
cleaved (22.4% versus 25.5% for R1 and 23.3% versus 27.0% for R2). Based
on the frequency of cleavage, or lack thereof, a hierarchy of restriction under
CviJI* conditions is evident, where PuGCPy >> PuGCPu = PyGCPy.

Example 8

CviJI* Restriction Generated Oligonucleotides

Due to the high frequency of CviJI or CviJI* restriction, it is
possible to generate useful oligonucleotides by digestion and a heat denaturation
step as described above. The size and number of the resulting oligonucleotides
are important for subsequent applications such as those described above. If, for
example, an oligonucleotide is to be used with a large genome, it has to be long
enough so that the sequence detected has a probability of occurring only once in
the genome. This minimum length has been calculated to be 17 nucleotides for
the human genome (Thomas, C.A., Jr. Prog. Nucl. Acid Res. Mol. Biol. 5:315
(1966)). Oligonucleotides used for sequencing or PCR amplification are generally
17-24 bases in length. Oligomers of shorter length will often bind at multiple
positions, even with small genomes, and thus will generate spurious extension
products. Thus, an enzymatic method for generating oligomers should ideally
result in polymers greater than 18 bases in length.
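
One common back-of-envelope argument for such a minimum length treats the genome as a random sequence and asks for the shortest oligonucleotide whose number of possible sequences (4^n) exceeds the number of positions at which it could anneal. The sketch below is illustrative only: the function name is hypothetical, the haploid human genome size of 3 x 10^9 bp is an assumed round figure, and this is not claimed to be the calculation used in the cited reference, although under these assumptions it also arrives at 17.

```python
def min_unique_length(genome_bp: int, both_strands: bool = True) -> int:
    """Smallest n such that 4**n is at least the number of candidate sites.

    ASSUMPTION: the genome is modelled as a random sequence with equal base
    frequencies, which real genomes only approximate.
    """
    if genome_bp <= 0:
        raise ValueError("genome size must be positive")
    candidate_sites = genome_bp * (2 if both_strands else 1)
    n = 1
    while 4 ** n < candidate_sites:
        n += 1
    return n


if __name__ == "__main__":
    human_bp = 3_000_000_000  # assumed haploid genome size
    print("human genome:", min_unique_length(human_bp), "nt")
    print("pUC19 (2686 bp):", min_unique_length(2686), "nt")
```

The much smaller value obtained for a plasmid-sized target is a statistical minimum only; it does not account for mismatched annealing, which is one reason longer primers are preferred in practice.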

The theoretical number of pUC19 CviJI* restriction-generated
oligomers is 314 (157 CviJI* restriction fragments x 2 oligomers/fragment), the
size distribution of which is shown in panel A of Figure 5. Most of the expected
CviJI* restriction-generated oligomers (about 75%) are smaller than 20 bp. This
assumes that CviJI is capable of restricting DNA to very small fragments, the
shortest of which would be 2 bp. However, in practice, about 93% of the cloned
CviJI* fragments were 20-56 bp in size, and 3% of the fragments generated by
CviJI* were smaller than 20 bp (panel B of Figure 5). This suggests that CviJI*
is not able to bind or restrict those fragments below a certain threshold length.
Since the smallest observed fragment is 18 bp, it may be assumed that this length
is the minimal size which can be generated from a given larger fragment.
Whatever the reason for this phenomenon, CviJI* treatment of DNA produces a
relatively small range of oligomers (mostly 20-60 bases in length), most of which
are a perfect size class for molecular biology applications.
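
The theoretical fragment distribution referred to above can be approximated with a simple in silico digest. The Python below is a minimal sketch with hypothetical function names. It assumes a linear molecule, a complete digest with cleavage at every NGCN site except PyGCPu, and a blunt cut between the G and the C; the cut position is taken from the commonly cited CviJI cleavage pattern and is not stated in this section. It models the theoretical case only and does not reproduce the observed lower limit on fragment size.

```python
def cut_positions(sequence: str) -> list:
    """Indices at which a complete CviJI* digest would cut a linear sequence.

    ASSUMPTIONS: every NGCN site except PyGCPu is cut, and the cut falls
    between the G and the C of the central dinucleotide.
    """
    seq = sequence.upper()
    cuts = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] != "GC":
            continue
        five, three = seq[i - 1], seq[i + 2]
        if five not in "ACGT" or three not in "ACGT":
            continue  # ambiguous flank, site not scored
        if five in "CT" and three in "AG":
            continue  # PyGCPu is not cleaved
        cuts.append(i + 1)  # position between G (index i) and C (index i + 1)
    return cuts


def fragment_lengths(sequence: str) -> list:
    """Lengths of the double-stranded fragments from a complete digest."""
    bounds = [0] + cut_positions(sequence) + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def oligomer_count(n_fragments: int) -> int:
    """Each double-stranded fragment yields two oligomers on denaturation."""
    return 2 * n_fragments


if __name__ == "__main__":
    toy = "AAAGCTTTTGCAAA"  # one PuGCPy site (cut) and one PyGCPu site (not cut)
    print(fragment_lengths(toy))
    print(oligomer_count(157))
```

Filtering the returned lengths at the 18 bp threshold noted above would give a rough picture of which theoretical fragments are unlikely to be recovered in practice.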

Example 9 
Anonymous Primer Cloning 

Primers are critical tools in many molecular biology applications 
such as PCR, sequencing, and as probes. Anonymous primers are useful as 
sequencing primers for genomic sequencing projects, as probes for mapping 
chromosomes, or to generate oligonucleotides for PCR amplification. 

The Anonymous Primer Cloning (APC) method is a variation of
shotgun cloning in that unknown sequences of DNA are being randomly cloned.
However, unlike CviJI shotgun cloning, wherein a partial CviJI** digest of DNA
is cloned, anonymous primer cloning utilizes a complete CviJI* digest to restrict
large DNAs into small fragments 20-200 bp in size. These small fragments are
cloned into a unique vector designed for excising the anonymous DNA as labeled
primers. The strategy for this method is illustrated in Figure 6.

As illustrated in Figure 6, the APC strategy reduces large DNAs
to small fragments, which are cloned and excised for use as primers. Plasmid
pFEM has a unique arrangement of the restriction sites for MboII and FokI, which
permits DNA cloned into the EcoRV site to be excised without associated vector
DNA. This is possible because FokI cleaves 9/13 bases to the left of the
recognition site shown in pFEM and MboII cleaves 8/7 bases to the right of the
recognition site shown in pFEM, which is well into the cloned anonymous
sequence. After MboII or FokI restriction, a known flanking primer is annealed
(primer 1 or 2) and extended using a DNA polymerase and dNTPs. The primer
is previously end-labeled, or alternatively, one or more of the dNTPs is
radioactive.

After denaturation of the newly synthesized DNA and separation
from its cognate template, the labeled anonymous primer is ready for use in
sequencing the original template from which it was subcloned. The presence of
the pFEM vector sequence fused to the anonymous sequence does not influence
the enzymatic extension of this primer from its unique binding site, as the vector
DNA is at the 5' end and the unique sequence is located at the 3' end (all
polymerases extend 5' to 3'). Both the top and bottom strand primers may be
excised from pFEM due to the symmetrical placement of restriction sites and
flanking primer binding sites. Thus, two primers may be derived from each
cloning event. APC is particularly well suited to the genomic sequencing strategy
of Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1991-1995 (1984), although
its utility is not limited thereto.

Example 10 

End Labeling of Restriction-Generated Oligonucleotides 

As is clear from the foregoing examples, digesting DNA with
CviJI provides the ability to generate sequence-specific oligonucleotides ranging
in size from 20-200 bases in length with an average length of 20-60 bases.
Sequence-specific oligonucleotides generated by CviJI* digestion may be labeled
directly at the 5'-end or at the 3'-end using techniques well known in the art.

For example, 5'-end labeling may be accomplished by either a
forward reaction or an exchange reaction using the enzyme T4 polynucleotide
kinase. In the forward reaction, 32P from [γ-32P]ATP is added to a 5' end of an
oligonucleotide which has been dephosphorylated with alkaline phosphatase using
standard techniques widely known in the art and described in detail in Sambrook
et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring
Harbor Laboratory Press (1989). In an exchange reaction, an excess of ADP
(adenosine diphosphate) is used to drive an exchange of a 5'-terminal phosphate
from the sequence-specific oligonucleotide to ADP, which is followed by the
transfer of 32P from [γ-32P]ATP to the 5'-end of the oligonucleotide. This
reaction is also catalyzed by T4 polynucleotide kinase and is described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Laboratory Press (1989).

Homopolymeric tailing is another standard labeling technique useful
in the labeling of CviJI*-generated sequence-specific oligonucleotides. This
reaction involves the addition of 32P-labeled nucleotides to the 3'-end of the
sequence-specific oligonucleotides using a terminal deoxynucleotidyl transferase
(Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Laboratory Press (1989)).

Commonly used labeling techniques typically employ a single
oligonucleotide directed to a single site on the target DNA and containing one or
a few labels. Oligonucleotides generated by the method of the present invention
are directed to many sites of a target DNA by virtue of the fact that they are
generated from a sample of the target sequence. Thus, the hybridization of
multiple oligonucleotides (labeled by the methods described above) allows a
significantly enhanced sensitivity in the detection of target sequences. In addition,
the short length of the labeled oligonucleotides used in the methods of the present
invention allows a reduction in hybridization time from overnight (as is used in
conventional methods) to 60 min.

Although labeling sequence-specific oligonucleotides with 32P is
described above, labeling with other radionuclides and with non-radioactive labels
is also within the scope of the present invention.

Example 11
Primer Extension Labeling of DNA Using
Restriction-Generated Oligonucleotides (PEL-RGO)

Another aspect of the present invention includes methods for
labeling DNA which include the generation of oligonucleotide primers by
complete digestion with CviJI*, followed by heat denaturation. PEL-RGO
requires three steps: 1) generating the sequence-specific oligonucleotides by CviJI*
restriction of the template DNA; 2) denaturation of the template and primer; and
3) primer extension in the presence of labeled nucleotide triphosphates. Plasmid
DNA may be prepared by methods known in the art such as the alkaline lysis or
rapid boiling methods (Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York (1989)). In addition, the vector should be linearized to ensure
effective denaturation. A restriction fragment may be labeled after separation on
low melting point agarose gels by methods well known in the art.

In PEL-RGO labeling, template DNA to be labeled is divided into
two aliquots; one is used to generate the sequence-specific oligonucleotide primers
and the other aliquot is saved for the primer annealing and extension reaction.
A typical reaction mix for generating sequence-specific oligonucleotides is
assembled in a microcentrifuge tube and includes: 100 ng DNA; 2 µl 5X CviJI*
buffer; 0.5 µl CviJI (1 u/µl); sterile distilled water to 10 µl final volume. CviJI*
5X restriction buffer includes: 100 mM glycylglycine (Sigma, St. Louis,
Missouri, Cat. No. G2265) pH adjusted to 8.5 with KOH, 50 mM magnesium
acetate (Amresco, Solon, Ohio, Cat. No. P0013119), 35 mM β-mercaptoethanol
(Mallinckrodt, Paris, Kentucky, Cat. No. 60-24-2), 5 mM ATP, 100 mM
dithiothreitol (Sigma, St. Louis, Missouri, Cat. No. D9779) and 25% v/v DMSO
(Mallinckrodt, Cat. No. 67-68-5). CviJI is obtained from CHIMERx (Madison,
Wisconsin). The reaction mix is incubated at 37°C for 30 min, followed by the
inactivation of CviJI by heating at 65°C for 10 min. The CviJI*-restricted DNA
may be used directly without further purification, or it may be stored at -20°C for
several months for subsequent labeling reactions.

After heat-inactivating CviJI, 0.2 µg of the digested and undigested
DNA are electrophoresed on a 1.5% agarose gel, using a suitable molecular
weight marker for comparison. The CviJI restriction fragments appear as a low
molecular weight smear in the 20-200 bp range.

By way of example, 1-10 ng of linearized pUC19 was labeled under
the conditions described below. A template-primer cocktail was prepared by
mixing 10 ng of linearized pUC19 DNA template with 20 ng pUC19
sequence-specific oligonucleotides (prepared as described above), and the mixture
was brought to a final volume of 17 µl with sterile distilled water. The
template-primer mixture was denatured in a boiling water bath for 2 minutes and
immediately placed on ice.

The following labeling mixture was then added to the template-primer
mix: 2.5 µl 10X labeling buffer (500 mM Tris-HCl at pH 9.0, 30 mM MgCl2,
200 mM (NH4)2SO4, 20 µM dATP, 20 µM dTTP, 20 µM dGTP, 0.4% NP-40);
5.0 µl [α-32P]dCTP (3000 Ci/mmol, 10 µCi/µl; New England Nuclear, Catalog
No. NEG013H); 0.5 µl Thermus flavus DNA polymerase (5 u/µl) (Molecular
Biology Resources, Milwaukee, Wisconsin); and distilled water up to 25 µl final
volume. The reaction was incubated at 70°C for 30 min and then stopped
by adding 2 µl of 0.5 M EDTA at pH 8.0 to the reaction mix.

The efficiency of the labeling reaction is gauged by the percentage
of radioisotope incorporated into labeled DNA. One microliter of the labeling
reaction is added to 99 µl of 10 mM EDTA in a microcentrifuge tube. This serves
as the source of diluted probe for total and trichloroacetic acid (TCA)-precipitable
counts. 2 µl of diluted probe is spotted onto the center of a glass fiber filter disc
(Whatman number 934-AH). The disc is then allowed to dry and is then placed
in a vial containing scintillation cocktail for counting total radioactivity in a liquid
scintillation counter. Another 2 µl aliquot from the diluted probe is added to 1
ml of 10% ice cold TCA followed by the addition of 2 µl of carrier bovine serum
albumin (BSA). This mixture is then placed on ice for 10 minutes. The
precipitate is then collected on a glass filter disc (Whatman No. 934-AH) by
vacuum filtration. The filter is then washed with 20 ml of ice cold 10% TCA,
allowed to dry, placed in a vial containing scintillation cocktail, and counted.

Because primer extension oligonucleotide labeling results in net
DNA synthesis, the specific activity of labeled DNA is calculated using the
following guidelines.

$$\text{Total cpm incorporated} = \text{TCA cpm} \times 50 \times 27$$

wherein the factor 50 is derived from using 2 µl of a 1:100 dilution for TCA
precipitation. The number 27 converts this back to the total reaction volume
(which is the reaction volume plus 2 µl of stop solution).

$$\text{Synthesized DNA (ng)} = \text{theoretical yield} \times \text{fraction of radioactivity incorporated}$$

$$\text{Theoretical yield (ng of DNA)} = \frac{\mu\text{Ci dNTPs added} \times 4 \times 330\ \text{ng/nmole}}{\text{specific activity of dNTP (Ci/mmole} = \mu\text{Ci/nmole)}}$$

$$\text{Fraction of incorporated label} = \frac{\text{TCA precipitated cpm}}{\text{total cpm}}$$

$$\text{Specific activity (cpm/}\mu\text{g of DNA)} = \frac{\text{total cpm incorporated} \times 1000}{\text{synthesized DNA} + \text{input DNA}}$$

wherein 1000 is the factor converting nanograms to micrograms.

By way of example, the following represents the calculation of
specific activity for an aliquot of pUC19 DNA labeled using this method, using
50 µCi of [α-32P]dCTP in a 25 µl reaction, where the TCA precipitated cpm is
26192 and the total cpm is 102047:

$$\text{Total cpm incorporated} = 26192 \times 50 \times 27 = 3.54 \times 10^{7}\ \text{cpm}$$

$$\text{Theoretical yield} = \frac{50\ \mu\text{Ci} \times 4 \times 330\ \text{ng/nmole}}{3000\ \mu\text{Ci/nmole}} = 22\ \text{ng}$$

$$\text{Fraction of label incorporated} = \frac{\text{TCA precipitated cpm}}{\text{total cpm}} = \frac{26192}{102047} = 0.256$$

$$\text{Synthesized DNA} = 22 \times 0.256 = 5.6\ \text{ng}$$

With input DNA = 10 ng:

$$\text{Specific activity} = \frac{3.54 \times 10^{7} \times 1000}{5.6 + 10} = 2.27 \times 10^{9}\ \text{cpm/}\mu\text{g}$$

Unincorporated radioactive label may be removed using standard
methods well known in the art.

Comparisons were made between PEL-RGO and RPL under similar
conditions using a radiolabeled probe. A detection limit of 100 fg was observed
with PEL-RGO labeled DNA, compared to a detection limit of 500 fg with RPL.
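
The specific-activity calculation of Example 11 can be expressed as a short sketch. The Python below is illustrative only; the function and parameter names are hypothetical. It evaluates the formulas exactly as written, with the dilution factor of 50, the total volume of 27 µl, four dNTPs and 330 ng/nmole, using the pUC19 figures from the worked example. Evaluated this way, the total incorporated counts come to about 3.54 x 10^7 cpm and the specific activity to roughly 2.3 x 10^9 cpm/µg; small differences in the last digit arise from rounding of the intermediate values.

```python
def total_cpm_incorporated(tca_cpm: float, dilution_factor: float = 50,
                           total_volume_ul: float = 27) -> float:
    """Scale the TCA-precipitable counts of the sampled aliquot to the whole reaction."""
    return tca_cpm * dilution_factor * total_volume_ul


def theoretical_yield_ng(uci_added: float, specific_activity_uci_per_nmol: float,
                         n_dntps: int = 4, ng_per_nmol: float = 330) -> float:
    """Maximum DNA (ng) that could be synthesized from the labeled dNTP supplied."""
    return uci_added * n_dntps * ng_per_nmol / specific_activity_uci_per_nmol


def specific_activity_cpm_per_ug(total_cpm: float, synthesized_ng: float,
                                 input_ng: float) -> float:
    """cpm per microgram of DNA; the factor 1000 converts ng to µg."""
    return total_cpm * 1000 / (synthesized_ng + input_ng)


if __name__ == "__main__":
    tca_cpm, total_cpm_counted = 26192, 102047
    incorporated = total_cpm_incorporated(tca_cpm)
    fraction = tca_cpm / total_cpm_counted
    synthesized = theoretical_yield_ng(50, 3000) * fraction
    activity = specific_activity_cpm_per_ug(incorporated, synthesized, input_ng=10)
    print(f"total cpm incorporated: {incorporated:.3g}")
    print(f"fraction incorporated:  {fraction:.3f}")
    print(f"synthesized DNA (ng):   {synthesized:.1f}")
    print(f"specific activity:      {activity:.3g} cpm/ug")
```

Keeping the dilution factor and total volume as parameters makes it straightforward to reuse the same functions when a different aliquot size or stop-solution volume is used.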

Example 12 

Thermal Cycle Labeling and Universal Thermal Cycle Labeling 

Thermal Cycle Labeling (TCL) is a method according to the present
invention for efficiently labeling double-stranded DNA while simultaneously
amplifying large amounts of the labeled probe. TCL of DNA requires two
general steps: 1) generation of the sequence-specific oligonucleotides by CviJI*
restriction of the template DNA; and 2) repeated cycles of denaturation,
annealing, and extension in the presence of a thermostable DNA polymerase or
a functional fragment thereof which maintains polymerase activity. Optimal
results are obtained after 20 such cycles, which are best performed in an automated
thermal cycling instrument such as a Perkin-Elmer Model 480 thermocycler. In
conjunction with such an instrument, about 1.5 hr is required to complete this
protocol. If a thermal cycler is not available, these reactions may be performed
using heat blocks. As few as 5 cycles may yield probes with acceptable detection
sensitivities. The generation of sequence-specific oligonucleotides for use in this
method may also be accomplished using the restriction endonuclease reagent
CGase I described in Example 20 or the restriction endonuclease Aci I, which has
as a recognition sequence CCGC.

Non-radioactive labeling of DNA using TCL is accomplished by
mixing: 10 pg - 100 ng linearized template, 50 ng CviJI-digested primers
(prepared as described above), 1.5 µl 10X labeling buffer, 0.5 µl Thermus flavus
DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee,
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl 2 mM dTTP.
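
For reference, the final nucleotide concentrations implied by this mix can be estimated with the usual dilution relation (stock concentration x volume added / final volume). The sketch below assumes a 15 µl final reaction volume, which is the volume stated for the radioactive reaction and is an assumption for this mix; the function and variable names are illustrative only.

```python
def final_concentration_mM(stock_mM: float, added_ul: float, final_ul: float = 15.0) -> float:
    """Final concentration after dilution: C_final = C_stock * V_added / V_final."""
    return stock_mM * added_ul / final_ul


# (stock concentration in mM, volume added in µl), as listed in the text.
nucleotide_mix = {
    "biotin-11-dUTP": (1.0, 1.0),
    "dATP": (2.0, 1.5),
    "dCTP": (2.0, 1.5),
    "dGTP": (2.0, 1.5),
    "dTTP": (2.0, 1.0),
}

final_mM = {
    name: final_concentration_mM(stock, volume)
    for name, (stock, volume) in nucleotide_mix.items()
}

for name, conc in final_mM.items():
    print(f"{name}: {conc * 1000:.0f} uM")
```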

Radioactive labeling of DNA using TCL was accomplished by
mixing 10 pg - 100 ng of CviJI-generated primers, 10 pg - 25 ng of linearized
template, 1.5 µl of 10X labeling buffer, 5 µl of 32P-dCTP (3000 Ci/mmole, 10
µCi/µl or 40 µCi/µl), 0.5 µl of Thermus flavus DNA polymerase (5 u/µl), and 0.5
µl each of dATP, dGTP, and dTTP (1 mM). The reaction mix was
brought to a volume of 15 µl with deionized H2O, overlaid with mineral oil, and
cycled through 20 rounds of denaturation, annealing, and extension. A typical
cycling regimen employed 20 cycles of denaturation at 91°C for 5 sec, annealing
at 50°C for 5 sec, and extension at 72°C for 30 sec. The reaction was then
terminated by adding 1 µl of 0.5 M EDTA, pH 8.0. The amplified, labeled probe
is a very heterogeneous mixture of fragments, which appears as a smear when
analyzed by agarose gel electrophoresis.
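
The typical cycling regimen can be written down as a small data structure, which also makes the programmed hold time easy to total. This is a sketch only: the class and function names are hypothetical, and the total counts hold times alone, so it excludes the temperature ramping that accounts for most of the roughly 1.5 hr quoted for the full protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: float


# Typical regimen from the text: 91°C/5 sec, 50°C/5 sec, 72°C/30 sec, 20 cycles.
REGIMEN = (
    Step("denaturation", 91.0, 5.0),
    Step("annealing", 50.0, 5.0),
    Step("extension", 72.0, 30.0),
)
CYCLES = 20


def total_hold_seconds(steps, cycles: int) -> float:
    """Sum of programmed hold times over all cycles (ramp time not included)."""
    return cycles * sum(step.seconds for step in steps)


hold_s = total_hold_seconds(REGIMEN, CYCLES)
print(f"programmed hold time: {hold_s:.0f} s ({hold_s / 60:.1f} min)")
```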

Universal thermal cycle labeling (UTCL) is a method according to
the present invention for efficiently labeling double-stranded DNA while
simultaneously amplifying large amounts of labeled probe. UTCL is unique in that
no sequence information is required regarding the template. The extension
primers are supplied endogenously via the holo-enzyme of the thermostable DNA
polymerase, and any anonymous DNA template can be labeled by repeated cycles
of denaturation, annealing, and extension in the presence of a labeled
deoxynucleotide triphosphate. Optimal results are obtained after 20 such cycles,
which are best performed in an automated thermal cycling instrument such as a
Perkin-Elmer Model 480 thermocycler. With such an instrument,
about 1.5 hr is required to complete this protocol. If a thermal cycler is not
available, these reactions may be performed using heat blocks. As few as 5
cycles may yield probes with acceptable detection sensitivities.

Non-radioactive labeling of DNA using UTCL is accomplished by
mixing: 10 ng linearized template, 1.5 µl 10X labeling buffer, 0.5 µl Thermus
flavus DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee,
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl 2 mM dTTP.

Radioactive labeling of DNA using UTCL was accomplished by
mixing: 10 pg - 100 ng of linearized template, 1.5 µl of 10X labeling buffer, 5 µl
of 32P-dCTP (3000 Ci/mmole, 10 µCi/µl or 40 µCi/µl), 0.5 µl of Thermus flavus
DNA polymerase (5 u/µl), and 0.5 µl each of dATP, dGTP, and dTTP (1 mM).
The reaction mix was brought to a volume of 15 µl with deionized
H2O, overlaid with mineral oil, and cycled through 20 rounds of denaturation,
annealing, and extension. A typical cycling regimen employed 20 cycles of
denaturation at 91°C for 5 sec, annealing at 50°C for 5 sec, and extension at 72°C
for 30 sec. The reaction was then terminated by adding 1 µl of 0.5 M EDTA, pH
8.0. The amplified, labeled probe is a very heterogeneous mixture of fragments,
which appears as a smear when analyzed by agarose gel electrophoresis.

Estimation of Bio-11-dUTP Incorporation
In order to estimate the level of incorporation of biotin-11-dUTP
into DNA, a serial dilution from 1:10 to 1:10^8 of the labeled probe (free of
unincorporated biotin-11-dUTP) is made in TE (10 mM Tris, 1 mM EDTA, pH 8).
A microliter of each dilution is placed on a neutral nylon membrane, and the
DNA sample is bound to the membrane either by UV cross linking for 3 min or
by baking at 80°C for 2 hr.
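
The dilution series can be tabulated ahead of time to see how much probe each 1 µl spot carries. The sketch below assumes ten-fold steps between 1:10 and 1:10^8, which the text implies but does not state, and uses a purely hypothetical stock concentration of 100 ng/µl for illustration.

```python
def serial_dilution_factors(first_exponent: int = 1, last_exponent: int = 8) -> list:
    """Ten-fold dilution factors from 1:10^first to 1:10^last (assumed step size)."""
    return [10 ** exponent for exponent in range(first_exponent, last_exponent + 1)]


def spotted_mass_ng(stock_ng_per_ul: float, dilution_factor: int, spot_ul: float = 1.0) -> float:
    """Mass of probe delivered in one spot of a given dilution."""
    return stock_ng_per_ul * spot_ul / dilution_factor


factors = serial_dilution_factors()
HYPOTHETICAL_STOCK_NG_PER_UL = 100.0  # illustrative value, not from the text

for factor in factors:
    mass = spotted_mass_ng(HYPOTHETICAL_STOCK_NG_PER_UL, factor)
    print(f"1:{factor:<10d} {mass:.2e} ng per 1 ul spot")
```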

The unbound sites on the membrane are blocked using a blocking
buffer for 15 min at 25°C. Streptavidin-alkaline phosphatase (Gibco-BRL,
Gaithersburg, Maryland, Cat. No. 9545A) is added to the blocking buffer (0.058
M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl, 0.02% sodium azide, 0.5%
casein hydrolysate, 0.1% Tween-20) at a 1:5000 dilution and incubated for 30
min. The membrane is rinsed 3 times for 10 min each with wash buffer (1X
PBS [0.058 M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl], 0.3% Tween,
0.2% sodium azide) and rinsed briefly (5 minutes) with AP buffer (100 mM NaCl,
5 mM MgCl2, 100 mM Tris-Cl, pH 9.5). Enough AP buffer containing
4.0 µl/ml nitro blue tetrazolium (NBT) (Sigma Cat. No. N6639) and 3.5 µl/ml of
5-bromo-4-chloro-3-indolyl phosphate (BCIP) (Sigma Cat. No. B6777) is then
added to cover the membrane. The membrane is left in the dark for
approximately 30 minutes or until the reaction is complete. The reaction is
stopped by rinsing in 1X PBS.

Detection Sensitivities
32P-labeled probes generated by the labeling protocols described above
detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target
sequence. As little as 10 pg of template DNA is enough to synthesize 5-10 ng of
radiolabeled probe, which is sufficient for screening 5 Southern blots. The
radioactive versions of TCL and UTCL yield extremely high specific activities
of labeled probe (about 5 x 10^9 cpm/µg DNA), which permit 5-10 fold lower
detection limits than conventional labeling protocols. The synthesis of higher
specific activity probes is probably the net result of the sequence-specific
oligonucleotide primers and their increased length when compared to the short
random primers used in other labeling methods. In addition, the thermal cycling
permits probe amplification.
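
To put the 25 zeptomole detection limit in more concrete terms, it can be converted to an approximate number of target molecules using the Avogadro constant. The short sketch below is illustrative only; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
ZEPTO = 1e-21


def zeptomoles_to_molecules(zeptomoles: float) -> float:
    """Convert an amount in zeptomoles to a number of molecules."""
    return zeptomoles * ZEPTO * AVOGADRO


molecules = zeptomoles_to_molecules(25)
print(f"25 zmol is roughly {molecules:,.0f} target molecules")
```

On this basis the stated limit corresponds to on the order of 15,000 copies of the target sequence.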

Biotin-labeled probes generated by the TCL and UTCL protocols
detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target sequence. A 15
µl TCL or UTCL reaction yields as much as 5-10 µg of labeled DNA, enough to
probe 5 to 10 Southern blots. Biotin-labeled TCL and UTCL probes provide a
10 fold greater detection sensitivity when compared to RPL biotin probes. In
addition, the thermal cycling permits probe amplification.

Non-radioactive, biotinylated probes labeled by the TCL and UTCL
methods were shown to have detection limits that are identical to those of the
radioactive probes. These methods have the advantage of eliminating the need to
work with hazardous radioactive materials without sacrificing sensitivity. In
addition, results are obtained from non-isotopic probes in 3-4 hours compared to
3-4 days for radiolabeled probes. The ability to substitute non-radioactive probes
for radioactive probes may be very useful to clinical laboratories, which do not use
radioisotopes but do need greater detection sensitivities. Research laboratories
favor the use of non-isotopic systems if detection sensitivity is not an issue. The
non-isotopic labeling versions of the TCL and UTCL systems represent a major
improvement in labeling DNA probes. Non-radioactive probes generated by the
methods of the present invention are also useful in the detection of RNA in situ.
An advantage of this system is that labeling protocols of the present invention
yield highly sensitive non-radioactive probes that are
predominantly in the small molecular weight range and can therefore penetrate the
tissue easily, unlike RPL probes. Because non-radioactive probes labeled using the
labeling protocols of the present invention have the same detection limits as do
radioactive probes similarly labeled, it is within the scope of this invention to use
either radioactive or non-radioactive probes for probing, for example, Southern
blots and Northern blots, for in situ hybridization for the detection of mRNA or DNA
in cells or tissue directly, and for colony or plaque lifts.

Example 13 
Quasi-Random Fragmentation of DNA 

Shotgun cloning and sequencing require the generation of an
overlapping population of DNA fragments. Therefore, conditions were
established for the partial digestion of DNA with CviJI to produce an apparently
random pattern, or smear, of fragments in the appropriate size range.
Conventional methods for obtaining partially restricted DNA include limiting the
incubation time or limiting the amount of enzyme used in the digestion. Initially,
agarose gel electrophoresis and ethidium bromide staining of the treated DNA
were utilized to assess the randomness and size distribution of the fragments.

CviJI was obtained from CHIMERx (Madison, Wisconsin).
Digestion of pUC19 DNA for limited time periods, or with limiting amounts of
CviJI under normal or relaxed conditions, did not produce a quasi-random
restriction pattern, or smear. Instead, a number of discrete bands were observed,
as shown in Figure 7, lane 3 for the CviJI* partial digestion of pUC19. Complete
digests of pUC19 under normal and CviJI* buffer conditions are shown in lanes
1 and 2 respectively. These results show that, under these relaxed conditions,
CviJI has a strong restriction site preference.

To eliminate the apparent restriction site preferences observed
under the partial restriction conditions described above, a series of altered reaction
conditions were explored. Conditions of high pH, low ionic strength, addition of
solvents such as glycerol or dimethylsulfoxide, and/or substitution of Mn2+ for
Mg2+ were systematically tested with CviJI endonuclease using the plasmid
pUC19. Figure 7 shows the results of these tests. In Lane M, a 100 bp DNA
ladder was run. In Lanes 1-4, pUC19 DNA (1.0 µg) was run after digestion at
37°C in a 20 µl volume for the following times and conditions: Lane 1, complete
CviJI digest (1 unit of enzyme for 90 min in 50 mM Tris-HCl, pH 8.0, 10 mM
MgCl2, 50 mM NaCl); Lane 2, complete CviJI* digest (1 unit of enzyme for 90
min in 50 mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20
mM DTT); Lane 3, partial CviJI* digest (0.25 units of enzyme for 30 min in 50
mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20 mM
DTT); Lane 4, partial CviJI** digest (0.5 units of enzyme for 60 min in 10 mM
Tris-HCl, pH 8.0, 10 mM MgCl2, 10 mM NaCl, 1 mM ATP, 20 mM DTT, 20%
v/v DMSO); and Lane 5, uncut pUC19 (1.0 µg).

The digestion condition which yielded the best "smearing" pattern
was obtained when the ionic strength of the relaxed reaction buffer was lowered
and an organic solvent was added (Figure 7, lane 4). Plasmid pUC19 partially
digested under these conditions yields a relatively non-discrete smear. This
activity is referred to as CviJI** to differentiate it from the originally
characterized star activity described in Xia et al., Nucl. Acids Res. 15:6075-6090
(1987). The appearance of diffuse, faint bands overlying a background smear
generated from this 2686 bp molecule indicates that some weakly preferred or
resistant restriction sites may bias the results of subsequent cloning experiments.

DNA was mechanically sheared by sonication utilizing a Heat
Systems Ultrasonics (Farmingdale, New York) W-375 cup horn sonicator as
specified by Bankier et al., Methods in Enzymology 155:51-93 (1987). DNA
fragmented by this method has random single-stranded overhanging ends (ragged
ends).

CviJI**-digested and sonicated samples were size fractionated by
agarose gel electrophoresis and electroelution, or by spin columns packed with the
size exclusion gel matrix, Sephacryl S-500 (Pharmacia LKB, Piscataway, New
Jersey), to eliminate small DNA fragments. Spin columns (0.4 cm in diameter) were
packed to a height of 1.3 cm by adding 1 ml of Sephacryl S-500 slurry and
centrifuging at 2000 RPM for 5 minutes in a Beckman CPR centrifuge. The columns
were rinsed 3 times with 1 ml aliquots of 100 mM Tris-HCl (pH 8.0) by centrifugation
at 2000 RPM for 2 min. Typically, 0.2-2.0 µg of fragmented DNA in a total
volume of 30 µl was applied to the column. The void volume, containing those
DNA fragments larger than 500 bp, was recovered in the column eluant after
spinning at 2000 RPM for 5 minutes. The capacity of this microcolumn
procedure is 2 µg of DNA. Agarose gel electrophoresis and electroelution are
described in detail by Sambrook et al., Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
(1989) and are well known to those skilled in the art. In these experiments, 5 µg
of sample was pipetted into a 2 cm-wide slot on a 1% agarose gel.
Electrophoresis was halted after the bromophenol blue tracking dye had migrated
6 cm. Fragments larger than 750 bp, as judged by molecular size markers, were
separated from smaller sizes and electrophoresed onto dialysis tubing (1000 MW
cutoff). The fractionated material was extracted with phenol-chloroform and
precipitated using ice cold ethanol (50% final volume) and ammonium acetate (2.5
M final concentration).

The ragged ends of the sonicated DNA were rendered blunt
utilizing two different end repair reactions. In one end repair reaction (ER 1),
sonicated DNA was treated according to the procedure outlined by Bankier et al.,
Methods in Enzymology 155:51-93 (1987), where 2.0 µg of sonicated lambda
DNA is combined with 10 units of the Klenow fragment of DNA polymerase I,
10 units T4 DNA polymerase, 0.1 mM dNTPs (deoxynucleotide
triphosphates = deoxyadenosine triphosphate, deoxythymidine triphosphate,
deoxycytidine triphosphate, and deoxyguanosine triphosphate), and reaction buffer
(50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 10 mM DTT). This mixture was
incubated at room temperature for 30 min followed by heat denaturation of the
enzymes at 65°C for 15 minutes. In a second end repair reaction (ER 2), an
excess of the reagents and enzymes described above was utilized to ensure a
more efficient conversion to blunt ends. In this reaction, 0.2 µg of the sonicated
lambda DNA sample was treated under the same reaction conditions described
above.

Figure 8 shows comparisons of the size distributions of sonicated
DNA versus DNA that was partially digested with CviJI**. In Lanes M, a 1 kb
DNA ladder was run. In Lanes 1-3, untreated λ DNA (0.25 µg), sonicated λ
DNA (1.0 µg), and CviJI** partially-digested λ DNA (1.0 µg) were run,
respectively. In Lanes 4-6, untreated pUC19 (0.25 µg), sonicated pUC19 (1.0
µg), and CviJI** partially-digested pUC19 (1.0 µg) were run, respectively.

Fragmentation of a large substrate such as lambda DNA (45 kb)
revealed essentially no banding differences between the CviJI** method and
sonication, as demonstrated in Figure 8, lanes 2 and 3. In addition, pUC19 DNA
that was partially digested with CviJI** gave a size distribution or "smear" that
closely resembled that achieved with sonication (Figure 8, lanes 5 and 6). As
expected, the minor bias evident with a small molecule such as pUC19 was not
detectable with a larger substrate such as lambda DNA.

The intensity and duration of sonic treatment affect the size
distribution of the resulting DNA fragments. The results obtained from the
sonication of lambda and pUC19 samples (Figure 8) were obtained from three 20
second pulses at a power setting of 60 watts. The smears generated by the two
methods are similar, although the size distribution of fragments is consistently
greater with CviJI** fragmentation. This result favors the cloning of larger inserts,
which improves the efficiency of end-closure strategies (Edwards et al., Genome
6:593-608 (1990)). The size distribution of the DNA fragmented by CviJI** is
controlled by incubation time and amount of enzyme, variables which are readily
optimized by routine analysis. An excess of enzyme or a long incubation time
will completely digest pUC19 DNA, resulting in fragments which range in size
from approximately 20 bp to approximately 150 bp (Figure 7, lanes 1 and 2).
The results shown in Figure 8 were obtained by incubating pUC19 for 40 minutes
and lambda DNA for 60 minutes with 0.33 units of CviJI/µg substrate. The
efficiencies of the two methods for randomly fragmenting DNA were
quantitatively analyzed for use in molecular cloning, as described below.

Example 14 

Rapid DNA Size Fractionation Utilizing Spin Column Chromatography 

The amount of data obtained by the shotgun sequencing approach
is substantially increased if fragments of less than 500 bp are eliminated prior to
the cloning step. Small fragments yield only a portion of the sequence data which
may be collected from polyacrylamide gel based separations and, thus, such small
fragments lower the efficiency of this strategy. Agarose gel electrophoresis
followed by electroelution is commonly used to size fractionate DNA prior to
shotgun cloning (Bankier et al., Methods in Enzymol. 155:51-93 (1987)).
Approximately three hours are required to prepare the agarose gel, electrophorese
the sample, electroelute fragments larger than 500 bp, perform phenol-chloroform
extractions, and precipitate the resulting material.

The results of 5 out of 9 independent trials size-fractionating
CviJI**-fragmented lambda DNA by agarose gel electrophoresis are shown in
Figures 9A-E. The figures illustrate the following. In Figure 9A: lane M,
1 kb DNA ladder; lane λ, untreated λ DNA (0.25 µg); lane 1, unfractionated
(UF) CviJI** partially-digested λ DNA (1.0 µg); lane 2, column-fractionated (CF)
CviJI** partially-digested λ DNA (1.0 µg); lane 3, gel-fractionated (GF) CviJI**
partially-digested λ DNA (1.0 µg). Figures 9B-E show additional trials of the
same treatments, with lanes labeled as in Figure 9A.

Small DNA fragments may also be removed by passing the sample 
through a short column of Sephacryl S-500. Approximately 15 min. are needed 
to prepare the column and 5 min. to fractionate the DNA by this method. 

The results of three out of nine trials using a Sephacryl S-500
column are shown in Figures 9A-C. The efficiency of eliminating small DNA
fragments (<500 bp) by spin column chromatography appears high, and the
reproducibility was excellent. This result is in contrast to the agarose gel
electrophoresis and electroelution data presented in Figures 9A-E, wherein nine
replicate trials of this method yielded nine differently sized products, regardless
of the source of the agarose. Both methods yielded 30-40% recoveries as
measured by UV spectrophotometry. To quantitate the relative efficiencies of the
two fractionation methods, the lambda DNA size fractionated in Figure 9A, lanes
2 and 3, and Figure 9B, lane 3, was analyzed for cloning efficiency and insert
size, as described below.

Example 15
Cloning Efficiencies of Gel Elution and 
Chromatography Fractionation Methods 

The efficacy of size selection was quantified by two criteria: 1)
comparing the relative cloning efficiency of CviJI** partially-digested lambda
DNA fragments fractionated either by agarose gel electrophoresis and
electroelution or by micro-column chromatography, and 2) determining the size
distribution of the resulting cloned inserts. To reduce potential variables, large
quantities of the cloning vector and ligation cocktail were prepared, ligation
reactions and transformation of competent E. coli were performed on the same
day, numerous redundant controls were performed, and all cloning experiments
were repeated twice. Ligation reactions were carried out overnight at 12°C in 20
µl mixtures using the following conditions: 25 mM Tris-HCl (pH 7.8), 10 mM
MgCl2, 1 mM DTT, 1 mM ATP, DNA, and 2000 units of T4 DNA ligase. For
unfractionated samples, 10 ng of fragments and 100 ng of HincII-restricted,
dephosphorylated pUC19 were combined under the above conditions. For
Sephacryl S-500 fractionated samples, 50 ng of size-selected fragments were
ligated with 100 ng of HincII-restricted, dephosphorylated pUC19. This increase
in fractionated DNA was determined empirically to compensate for the lower
concentration of "ends" resulting from the fractionation procedure and/or the
lowered efficiency of cloning larger fragments. Ligation reaction products were
added to competent E. coli DH5αF' (φ80dlacZΔM15 Δ(lacZYA-argF)U169 deoR
gyrA96 recA1 relA1 endA1 thi-1 hsdR17(rK-, mK+) supE44 λ-) in a
transformation mixture as specified by the manufacturer (Life Technologies,
Bethesda, Maryland), and aliquots of the transformation mixture were plated on
T agar (Messing, Methods in Enzymol. 101:20-78 (1983)) containing 20 µg/ml
ampicillin, 25 µl of a 2% solution of isopropylthiogalactoside (IPTG), and 25 µl
of a 2% solution of 5-bromo-4-chloro-3-indolylgalactoside (X-GAL). The
cloning efficiencies reported are the average of triplicate platings of each ligation
reaction. The concentration of the fractionated material was checked
spectrophotometrically so that 50 ng was added to all ligation reactions. This
material was ligated to HincII-digested and dephosphorylated pUC19. This
cloning vector was chosen because it permits a simple blue to white visual assay
to indicate whether a DNA fragment was cloned (white) or not (blue) (Messing,
Methods in Enzymol. 101:20-78 (1983)).

A summary of the cloning efficiencies calculated from two
independent trials is given in Table 3.

TABLE 3

Cloning Efficiencies of CviJI** Partially Digested Lambda DNA
Fractionated by Microcolumn Chromatography Versus Agarose Gel
Electroelution.

                                       Colony Phenotype
                                  Trial I           Trial II
DNA/treatment                   Blue    White     Blue    White

Supercoiled pUC19              55000     <10     50000     <10
pUC19/HincII/CIAP                210      <1       320       1
pUC19/HincII/CIAP/
  T4 DNA ligase                  150       4       210       7
λ/CviJI** partial/CF
  + pUC19                        140     240       210     240
λ/CviJI** partial/GFE1
  + pUC19                         98      49       200      18
λ/CviJI** partial/GFE2
  + pUC19                         82      54        95      74


Cloning efficiencies reflect the number of ampicillin-resistant
colonies/ng pUC19 DNA. CIAP represents treatment with calf intestinal alkaline
phosphatase used to dephosphorylate HincII-digested pUC19 to minimize
self-ligation. CF refers to DNA that was fractionated on Sephacryl S-500 columns as
described above. GFE1 and GFE2 refer to two runs wherein DNA was
fractionated by agarose gel electrophoresis and electroeluted. λ refers to
bacteriophage λ DNA.

These trials represent repeated experiments in which λ DNA
fragments generated by CviJI** partial digestion were ligated to HincII-linearized,
dephosphorylated pUC19 and transformed into the DH5αF' competent cells described
above. The first three rows in Table 3 show controls performed to establish a
baseline to better evaluate the various treatments. Supercoiled pUC19 transforms
E. coli 10 times more efficiently than the HincII-digested plasmid and 150-260
times more efficiently than the HincII-digested and dephosphorylated plasmid.
The number of blue and white colonies which resulted from transforming
HincII-cut and dephosphorylated pUC19 was determined both before and after
treatment with T4 DNA ligase in order to differentiate these background events from
cloning inserts. The background of blue colonies (which represent the uncut
and/or non-dephosphorylated population of molecules) averaged 0.4%, compared
to supercoiled plasmid. The background of white colonies (which presumably
results from contaminating nucleases in the enzyme treatments or genomic DNA
in the plasmid preparations) after HincII digestion, dephosphorylation, and ligation
of pUC19 averaged 0.014% as compared to the supercoiled plasmid.

The number of white colonies obtained when micro-column
fractionated DNA was cloned into pUC19 was 240/ng vector in both trials. The
efficiency of cloning gel fractionated and electroeluted DNA ranged from 18-74
white colonies/ng vector. The data show that column fractionated DNA results
in three to thirteen times the number of white colonies, and presumably
recombinant inserts, as gel fractionated and electroeluted DNA. The size
distribution of the inserts present in these white colonies is depicted in Figures
10A-C. In Figure 10A, a CviJI** partial digest of 2 µg of λ DNA was size
fractionated on a 4 mm by 13 mm column of Sephacryl S-500 at 2,000 x g for 5
minutes. The void volume containing partially digested DNA was directly ligated
to linear, dephosphorylated pUC19 and 43 resulting clones were analyzed for
insert size. The DNA for this experiment is the same as that shown in Figure
9A, lane 2. In Figure 10B, a CviJI** partial digest of 5 µg of λ DNA was size
fractionated by agarose gel electroelution. The eluted DNA was phenol-extracted
and ligated to linear, dephosphorylated pUC19, and the resulting 40 clones were
analyzed for insert size. The DNA for this experiment is the same as that shown
in Figure 9A, lane 3. In Figure 10C, the procedure is the same as in Figure 10B,
except the DNA for this experiment came from Figure 9B, lane 3.
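
The fold differences quoted above follow directly from the colony counts reported in Table 3. The sketch below reproduces that arithmetic; the variable and function names are illustrative, and the counts are per ng of vector, with each pair ordered as (Trial I, Trial II).

```python
# White colonies per ng vector from Table 3, as (Trial I, Trial II).
WHITE_COLONIES = {
    "CF": (240, 240),
    "GFE1": (49, 18),
    "GFE2": (54, 74),
}
# Blue colonies per ng vector for two control rows, as (Trial I, Trial II).
BLUE_SUPERCOILED = (55000, 50000)
BLUE_HINCII_CIAP = (210, 320)


def fold_ratios(numerators, denominators):
    """Element-wise ratio of two equal-length sequences of colony counts."""
    return [n / d for n, d in zip(numerators, denominators)]


# Column-fractionated versus each gel-fractionated run, trial by trial.
column_vs_gel = [
    ratio
    for gel in ("GFE1", "GFE2")
    for ratio in fold_ratios(WHITE_COLONIES["CF"], WHITE_COLONIES[gel])
]
# Supercoiled plasmid versus HincII-digested, dephosphorylated plasmid.
supercoiled_vs_cut = fold_ratios(BLUE_SUPERCOILED, BLUE_HINCII_CIAP)

print(f"column vs gel: {min(column_vs_gel):.1f}- to {max(column_vs_gel):.1f}-fold")
print(f"supercoiled vs cut/CIAP: {[round(r) for r in supercoiled_vs_cut]}")
```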

A total of 43 random clones obtained from micro-column
chromatography fractionation were analyzed for insert size (as shown in Figure
10A). Most of these inserts were larger than 500 bp (37/43 or 86%), 11.6%
(5/43) were smaller than 500 bp, and one clone (2.3%) was smaller than 250 bp.
The average insert size was 1630 bp. These results are in contrast to those
obtained by agarose gel fractionation (as shown in Figures 10B and 10C). In the
first trial (Figure 10B) most of the inserts were smaller than 500 bp (26/37 or
70.3%) and only 29.7% (11/37) were larger than 500 bp in size. In the second
trial (Figure 10C) all of the inserts (40 total) were smaller than 500 bp. Thus,
the use of agarose gel electroelution for the size fractionation of DNA results in
unexpectedly variable and low cloning efficiencies.
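
The insert-size percentages in this paragraph can be checked from the clone counts; for example, one clone out of 43 works out to 2.3%. A minimal sketch, with an illustrative helper name:

```python
def percent(count: int, total: int) -> float:
    """Percentage of clones in a size class, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Micro-column fractionation (Figure 10A): 43 clones analyzed.
column_over_500 = percent(37, 43)
column_under_500 = percent(5, 43)
column_under_250 = percent(1, 43)

# Agarose gel fractionation, first trial (Figure 10B): 37 clones scored.
gel_under_500 = percent(26, 37)
gel_over_500 = percent(11, 37)

print(column_over_500, column_under_500, column_under_250)
print(gel_under_500, gel_over_500)
```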

Example 16 

Cloning Sonicated and CviJI**-Digested Lambda DNA 

To compare the cloning efficiencies of sonicated and
CviJI**-digested nucleic acid, λ DNA was fragmented by each of these methods and
ligated to pUC19 which was linearized with HincII and dephosphorylated to
minimize self-ligation.

DNA fragmented by CviJI** digestion and sonication was cloned
both before and after Sephacryl S-500 size fractionation. Sonicated lambda DNA
was subjected to an end repair treatment prior to ligation. Ligations were
performed as described in Example 11. One-tenth of the ligation reaction (2 µl)
was utilized in the transformation procedure, and the fraction of nonrecombinant
(blue) versus recombinant (white) colonies was used to calculate the efficiency of
this process.

The efficacy of the methods was quantified by comparing the
cloning efficiency of lambda DNA fragments generated either by sonication or
CviJI** partial digestion. To reduce potential cloning differences based on size
preference, the size distribution of the DNA generated by these two methods was
closely matched. Other experimental details were designed to reduce potential
variables, as described above. Certain variables were unavoidable, however. For
example, the sonicated DNA fragments required an enzymatic step to repair the
ragged ends as described in Example 13 prior to ligation, whereas the CviJI**
digests were heat-denatured and directly ligated to HincII-digested pUC19.

A summary of the cloning efficiencies calculated from two
independent trials is given in Table 4, section A (unfractionated samples) and
section B (fractionated samples).

Cloning efficiencies represent the number of ampicillin-resistant
colonies/ng pUC19 DNA. CIAP indicates treatment with calf intestinal alkaline
phosphatase. ER 1 and ER 2 are end repair methods described in Example 13.
λ refers to bacteriophage lambda.

The indicated trials represent repeated experiments in which two
identical sets of lambda DNA fragments generated by AluI complete digestion,
CviJI** partial digestion, or sonication were each ligated to HincII-linearized,
dephosphorylated pUC19 and transformed into DH5αF' competent cells. The
cloning efficiencies reported are the average of triplicate platings of each ligation
reaction. In case the Sephacryl S-500 size fractionation step introduced inhibitors
of ligation or transformation or resulted in differences attributable to the size of
the material, the sonicated and CviJI**-digested samples were ligated with pUC19
both prior to (A) and after (B) the fractionation steps. The first three rows in
Table 4, sections A and B, are controls performed to establish a baseline to better
evaluate the various treatments. These data show that supercoiled pUC19
transforms E. coli 200-1000 times more efficiently than the HincII-restricted and
dephosphorylated plasmid. Without this dephosphorylation step, the cloning
efficiency is 10% that of the supercoiled molecule (data not presented). The
background of blue colonies averaged 0.5% in these experiments, compared to
supercoiled plasmid, while the background of white colonies averaged 0.005%.

A comparison of the data from unfractionated versus fractionated
samples in Table 4, sections A and B, reveals a general decline in the number of
white and blue colonies obtained after sizing. This decrease is primarily due to
the fact that cloning efficiencies are dependent upon the size of the fragment,
favoring smaller fragments and thus giving higher efficiencies for the
unfractionated material. This is illustrated by comparing the efficiency of cloning
unfractionated and fractionated λ DNA which was completely restricted with AluI.
This four base recognition endonuclease produces blunt ends and cuts λ DNA
(48,502 bp) at 143 sites. Only 25 of the resulting 144 fragments (17%) are larger
than 500 bp. The number of white colonies obtained when unfractionated λ
DNA, completely restricted with AluI, was cloned into pUC19 ranged from 250-
400/ng vector, versus 23-48/ng vector for the fractionated material. This ten fold
decrease was only noticed for the λ AluI digests, and probably reflects the large
portion of small molecular weight fragments (approximately 75%) which is
excluded from the fractionated ligation reactions.
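
The fragment count quoted above reflects the general rule that a linear molecule cut at n internal sites yields n + 1 fragments. The sketch below illustrates the rule and the 25/144 fraction; the function names and the toy cut positions are hypothetical and are not the actual AluI site map of λ DNA.

```python
def fragment_count(n_sites: int, circular: bool = False) -> int:
    """Number of fragments from a complete digest with n_sites cut sites."""
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    if circular:
        # An uncut circle is one molecule; each cut after the first adds a fragment.
        return max(n_sites, 1)
    return n_sites + 1


def fragment_lengths(length_bp: int, cut_positions) -> list:
    """Fragment lengths for a linear molecule cut at the given internal positions."""
    bounds = [0] + sorted(cut_positions) + [length_bp]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def fraction_larger_than(lengths, cutoff_bp: int) -> float:
    """Fraction of fragments strictly longer than cutoff_bp."""
    return sum(1 for length in lengths if length > cutoff_bp) / len(lengths)


lambda_fragments = fragment_count(143)          # linear λ DNA, 143 AluI sites
large_fraction_pct = 100.0 * 25 / lambda_fragments

toy_lengths = fragment_lengths(100, [40, 10])   # toy example, not real site data
print(lambda_fragments, round(large_fraction_pct), toy_lengths)
```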

The number of white colonies obtained when unfractionated CviJI**
treated λ DNA was cloned into pUC19 ranged from 160-340/ng vector, versus 68-
90 white colonies/ng vector if the same material was fractionated. Unfractionated
λ DNA, completely digested with AluI, results in cloning efficiencies very similar
to unfractionated CviJI** treated DNA. Sonicated λ DNA is a poor substrate for
ligation, compared to CviJI** treatment, as indicated by the roughly ten-fold
reduced cloning efficiencies.

Enzymatic repair of the ragged ends produced by sonication results
in an increased cloning efficiency. Using conditions described in Example 13 for
the first end repair treatment (ER 1), 10-44 (fractionated) and 19-32
(unfractionated) white colonies/ng vector were observed. However, ER 1
conditions may not be optimal, as an alternate end repair reaction (ER 2) (as
described in Example 13) resulted in greater numbers of white colonies (63 and
100/ng vector for fractionated and unfractionated DNA, respectively). In this
reaction, a ten-fold excess of reagents and enzymes was utilized to repair the
sonicated DNA, which apparently improved the efficiency of cloning such
molecules by two to three fold. The data collected from multiple cloning trials
in Table 4, sections A and B, show that CviJI** partial digestion results in three
to sixteen times the number of white colonies obtained with sonicated, ER 1-treated
DNA. Even with an optimal end repair reaction for the sonicated fragments, DNA
treated with CviJI** yielded three times more white colonies.

Example 17

Analysis of CviJI** Fragmentation for Shotgun Cloning and Sequencing 

The ability of CviJI** partial digestion to create uniformly
representative clone libraries for DNA sequencing was tested on pUC19 DNA.
pUC19 DNA was digested under CviJI** conditions and size fractionated as
described above. The fractionated DNA was cloned into the EcoRV site of
M13SPSI, a lacZ minus vector constructed by adding an EcoRV restriction site
to wild type M13 at position 5605. M13SPSI lacks a genetic cloning selection
trait; therefore, after ligation of the pUC19 fragments into the vector, the sample
was restricted with EcoRV to reduce the background of nonrecombinant plaques.
Bacteriophage M13 plaques were picked at random and grown for 5-7 hours in 2
ml of 2XTY broth containing 20 μl of a DH5αF' overnight culture. After
centrifugation to remove the cells, single-stranded phage DNA was purified using
Sephaglass™ as specified by the manufacturer (Pharmacia LKB, Piscataway, New
Jersey). The single-stranded DNA was sequenced by the dideoxy chain
termination method using a radiolabeled M13-specific primer and Bst DNA
polymerase (Mead et al., Biotechniques 11:76-87 (1991)). The first 100 bases of
76 randomly chosen clones were sequenced to determine which CviJI recognition
site was utilized, the orientation of each insert, and how effectively the cloned
fragments covered the entire molecule, as shown in Figure 11. The positions of
the 45 normal CviJI sites (PuGCPy) in pUC19 are indicated beneath the line
labeled "NORMAL" in Figure 11. Similarly, the 160 CviJI* sites (GC) are
indicated beneath the line labeled "RELAXED" in Figure 11. The marks above
these lines indicate the CviJI** pUC19 sites which were found in the set of 76
sequenced random clones. The frequency of cloning a particular site is indicated
by the height of the line, and the left or right orientation of each clone is also
indicated at the top of each mark. There are a total of 205 CviJI and CviJI* sites
in pUC19.



WO 94/21663 PCT/US94/03246

The data presented in Figure 11 demonstrate that, under CviJI**
partial conditions, normal CviJI sites are preferentially restricted over relaxed
(CviJI*) sites. Of the 76 clones that were analyzed, only 13%, or 1 in 7, had
sequence junctions corresponding to a relaxed CviJI* site. Thirty-five of the
forty-five possible normal restriction sites were cloned, as compared to eight of
the possible one hundred sixty relaxed sites. If the enzyme had exhibited no
preference for normal or relaxed sites under the CviJI** partial conditions utilized
here, then 78% of the sequence junctions analyzed should have been generated by
cleavage at a relaxed CviJI* site. It may be noted that the relaxed CviJI*
restriction sites that were found appear to be clustered in two regions of the
plasmid that are deficient in normal CviJI sites. In addition, the combined
distribution of the normal and relaxed sites which were restricted to generate the
76 clones appears to be quasi-random. That is, the longest gap between cloned
restriction sites was no greater than 250 bp and no one particular site is
over-utilized.
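
The 78% expectation follows directly from the site counts quoted above. As a minimal sketch (variable names are illustrative and not taken from the source), the share of relaxed-site junctions expected under no site preference is the relaxed share of all GC sites:

```python
# Site counts for pUC19 as stated in the text.
normal_sites = 45      # PuGCPy
relaxed_sites = 160    # remaining GC sites
total_sites = normal_sites + relaxed_sites

# Expected share of junctions at relaxed sites if every GC site were cut
# with equal probability.
expected_relaxed_share = relaxed_sites / total_sites

# Observed share reported for the 76 sequenced clones.
observed_relaxed_share = 0.13

# Fraction of each site class recovered at least once in the clone set.
normal_sites_recovered = 35 / normal_sites
relaxed_sites_recovered = 8 / relaxed_sites

print(f"expected relaxed share: {expected_relaxed_share:.0%}")
print(f"observed relaxed share: {observed_relaxed_share:.0%}")
```

The gap between the expected and observed shares is the quantitative basis for the statement that normal sites are preferentially restricted.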

A detailed analysis of the distribution of CviJI** sequence junctions
found from cloning pUC19 is presented in Table 5.

The GC sites in pUC19 may be divided into four classes based on
their flanking Pu/Py structure. The fraction of GC sites observed in pUC19 which
belong to each classification is roughly equal (22.0-27.8%). A striking difference
was found between the observed distribution in pUC19 of normal and relaxed (R1,
R2, R3) CviJI recognition sites and the distribution revealed by shotgun cloning
and sequence analysis of CviJI**-treated DNA. While most of the sites cleaved
by this treatment were found to be PuGCPy (about 87%), or "normal" restriction
sites, a significant fraction of the cleavage occurred at PyGCPy (about 6.5%) and
PuGCPu (about 6.6%) sites, considering the short incubation times and limiting
enzyme concentrations. The latter two categories of sites, and presumably the
PyGCPu sites as well, are completely restricted under "relaxed" conditions,
provided an excess of enzyme is present and sufficient time is allowed (see Figure
7, and Xia et al., Nucleic Acids Res. 15:6075-6090 (1987)).
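
The four-way classification of GC sites by their flanking bases can be sketched as below. This is an illustrative helper, not the procedure used to build Table 5: it reads a single strand of a linear sequence, ignores GC dinucleotides at the very ends (which lack a flanking base), and the function name and toy sequence are hypothetical.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_gc_sites(seq):
    """Count GC dinucleotides by flanking purine (Pu) / pyrimidine (Py) class."""
    counts = {"PuGCPy": 0, "PyGCPy": 0, "PuGCPu": 0, "PyGCPu": 0}
    seq = seq.upper()
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] != "GC":
            continue
        left, right = seq[i - 1], seq[i + 2]
        if left not in PURINES | PYRIMIDINES or right not in PURINES | PYRIMIDINES:
            continue  # skip ambiguous flanking bases such as N
        left_class = "Pu" if left in PURINES else "Py"
        right_class = "Pu" if right in PURINES else "Py"
        counts[f"{left_class}GC{right_class}"] += 1
    return counts


# Toy sequence (hypothetical) with one site of each class.
toy_counts = classify_gc_sites("AGCTCGCCAGCATGCG")
```

Note that on double-stranded DNA a PyGCPy site read on one strand is a PuGCPu site on the other, so the strand being read matters when comparing counts for those two classes.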

Digestion using CviJI** treatment results in a relatively even
distribution of breakage points across the length of the molecule (as shown in
Figure 11). As described above, Figure 11 depicts a linear map of pUC19
showing the relative position of the lacZ' gene (α peptide of the β-galactosidase
gene) and ampicillin resistance gene (Amp). The marks extending beneath the top
line (labeled "NORMAL") show the relative position of the 45 normal CviJI sites
(PuGCPy) present in pUC19. The marks above the line are the cleavage sites
found from sequencing the CviJI** partial library. The height of the line
indicates the number of clones obtained from cleavage at that site, and the
orientation of the flag designates the right or left orientation of the respective
clone. The marks extending beneath the second line (labeled "RELAXED") show
the relative positions of the 160 CviJI* sites (GC) present in pUC19. Those marks
above the line were found from sequencing the CviJI** partial library. The
bottom portion of Figure 11 shows the relative position and orientation of the first
20 clones sequenced, assuming a 350 bp read per clone. CviJI** cleavage at
relaxed sites appears to be important in "filling gaps" left by normal restriction.

The primary goal of this effort was to determine the efficacy of
these methods for rapid shotgun cloning and sequencing. For these purposes,
only 100 bases of sequence data were acquired per clone. However, if 350 bases
of sequence had been determined from each clone, then the entire sequence of
pUC19 would have been assembled from the overlap of the first 20 clones (Figure
11). In this sequencing simulation 75% of pUC19 would have been sequenced
at least 2 times from the first 20 clones. The highest sequencing redundancy
would have been six-fold, and would have involved only 2.2% of the DNA. Figure
11 also shows that most of the 1x sequencing coverage occurred in a region of the
plasmid with a very low density of normal and relaxed CviJI restriction sites.
Most of the single coverage occurs in a 240 bp region of the plasmid between
1490 bp and 1730 bp where there are only 4 CviJI relaxed sites. It should also
be noted that by the 27th randomly picked clone most of this region would have
been covered a second time.
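
A sequencing simulation of the kind described above can be sketched as a per-base depth count on a circular molecule. The sketch assumes each clone is summarized by a start coordinate and a read direction; the function names and the example clone list are hypothetical and are not the 20 clones of Figure 11.

```python
def coverage_depth(target_len, reads, read_len=350):
    """Per-base read depth on a circular target.

    reads is an iterable of (start, strand) pairs; strand "+" reads toward
    increasing coordinates and "-" toward decreasing coordinates.
    """
    depth = [0] * target_len
    for start, strand in reads:
        step = 1 if strand == "+" else -1
        for k in range(read_len):
            depth[(start + step * k) % target_len] += 1
    return depth


def fraction_covered(depth, min_depth):
    """Fraction of positions sequenced at least min_depth times."""
    return sum(1 for d in depth if d >= min_depth) / len(depth)


# Hypothetical clone junctions on a 2686 bp plasmid (illustration only).
example_reads = [(0, "+"), (300, "+"), (900, "-"), (1500, "+"), (2600, "+")]
example_depth = coverage_depth(2686, example_reads)
at_least_once = fraction_covered(example_depth, 1)
at_least_twice = fraction_covered(example_depth, 2)
highest_redundancy = max(example_depth)
```

Feeding in the actual junction coordinates and orientations of the first 20 clones would give the kind of figures quoted in the text (fraction covered at least twice, highest redundancy).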

Shotgun sequencing strategies are efficient for accumulating the
first 80-95% of the sequence data. However, the random nature of the method
means that the rate at which new sequence is accumulated decreases as more
clones are analyzed. In Figure 12 the total amount of unique pUC19 sequence
accumulated was plotted as a function of the number of clones sequenced. The
points represent the total amount of determined pUC19 sequence versus
the total number of clones sequenced. The horizontal dashed line demarcates the
2686 bp length of pUC19. The theoretical accumulation curve expected for a
process in which sequence information is acquired in a totally random fashion
is also shown. The smooth curve is a continuous plot of the discrete function
S(N) where

\[ S(N) = N L e^{-c\sigma} \left[ \frac{e^{c\sigma} - 1}{c} + (1 - \sigma) \right]. \]

This equation is based upon the results developed by Lander et al., Genomics
2:231-239 (1988) for the progress of contig generation in genetic mapping. In the
equation: N is the number of clones sequenced, L is the length of clone insert in
bp, c is the redundancy of coverage or LN/G (where G is length of fragment
being sequenced in bp), and $\sigma = 1 - \theta$, where $\theta$ is the fraction
of length that two clones must share. The curve in Figure 12 was calculated with
G = 2686 bp, L ≈ 350 bp, and $\sigma = 1$. The plotted points lie close to the
theoretical curve, and it thus appears that the sequence of pUC19 was accumulated
in an apparently random fashion utilizing CviJI** fragmentation and column
fractionation.
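
The accumulation curve can be evaluated numerically as sketched below. The function name is hypothetical; the symbols follow the definitions in the text, with σ = 1 assumed as the default (i.e. no required overlap between clones). Under that assumption the expression reduces to G(1 - e^(-LN/G)), which the sketch uses as a cross-check.

```python
import math


def expected_unique_sequence(n_clones, read_len, target_len, sigma=1.0):
    """Expected unique sequence S(N), in bp, after n_clones random reads.

    S(N) = N * L * exp(-c*sigma) * [ (exp(c*sigma) - 1)/c + (1 - sigma) ],
    with c = L*N/G.
    """
    if n_clones == 0:
        return 0.0
    c = read_len * n_clones / target_len
    expected_contigs = n_clones * math.exp(-c * sigma)
    expected_contig_len = read_len * ((math.exp(c * sigma) - 1.0) / c + (1.0 - sigma))
    return expected_contigs * expected_contig_len


G = 2686  # pUC19 length in bp, from the text
L = 350   # assumed read length per clone, from the text

for n in (5, 10, 20, 40, 76):
    s = expected_unique_sequence(n, L, G)
    print(f"N = {n:2d}  S(N) = {s:7.1f} bp  ({s / G:.0%} of pUC19)")
```

The printed values show the diminishing return described above: each additional clone contributes less new sequence as the curve approaches the 2686 bp ceiling.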

Example 18 

Shotgun Cloning Utilizing 200 ng of Lambda DNA 



Generally, 2-5 μg of DNA are needed for the sonication and
agarose gel fractionation method of shotgun cloning in order to provide the
several hundred colonies or plaques required for sequence analysis (Bankier et al.,
Methods in Enzymol. 155:51-93 (1987)). A ten-fold reduction in the amount of
substrate required greatly simplifies the construction of such libraries, especially
from large genomes (Davidson, J. DNA Sequencing and Mapping 1:389-394
(1991)). The efficiency of constructing a large shotgun library from nanogram
amounts of substrate was tested utilizing 200 ng of CviJI**-digested lambda DNA.
This material was column-fractionated as described previously. In this case, 1/2
of the column eluant (15 μl containing 50 ng of DNA) was ligated to 100 ng of
HincII-digested and dephosphorylated pUC19 as described in Example 15. The
cloning efficiencies of the control DNAs were similar to those reported in Tables
2 and 3. The 50 ng cloning experiment yielded 230 white colonies per ligation
reaction in one trial, and 410 white colonies per ligation reaction in a second trial.
Thus, it should be possible to routinely construct useful quasi-random shotgun
libraries from as little as 0.2-0.5 μg of starting material.

Example 19
Epitope Mapping 

CviJI* recognizes the sequence GC (except for PyGCPu) in the
target DNA. Under partial restriction conditions the length of fragment may be
controlled by incubation time. Epitope mapping using CviJI** partial digests
involves generating DNA fragments of 100-300 bp from a cDNA coding for the
protein of interest, by methods described in Example 13, inserting them into an
M13 expression vector, plating out on solid media, lifting plaques onto a
membrane, screening for binding to the ligand of interest, and picking the positive
plaques for isolation of the DNA, which is then sequenced to identify the epitope.
Thus, the same epitope may be expressed as a small fragment or a larger
fragment. This approach allows one to determine the smallest fragment
containing the epitope of interest using functional assays such as binding to an
antibody or other ligand, or using a direct assay for activity. For insertion into
an M13 vector, linkers may be added to the fragments or the insert may be
dephosphorylated to ensure that each fragment is cloned alone without ligation of
multiple inserts.

The expression vectors recommended for subcloning of the CviJI
fragments are Lambda Zap™ (Stratagene, La Jolla, California) or bacteriophage
M13-epitope display vectors. An advantage of using an M13-based vector is that
the peptide or protein of interest may be displayed along with the M13 coat
protein and does not require host cell lysis in order to analyze the protein of
interest. The lambda-based vectors yield plaques and hence the protein can be
directly bound to a membrane filter.

Example 20
CGase I

CGase I, as used herein, refers to a restriction endonuclease reagent which
cleaves DNA at the dinucleotide CG. CGase I activity is based on the combined
star activities of the restriction endonucleases Hpa II and Taq I. Under normal
reaction conditions (10 mM Bis Tris Propane-HCl pH 7.0, 10 mM MgCl2, 1 mM
DTT; 1 unit of enzyme/μg DNA, 37°C for 1 hr), Hpa II recognizes CCGG and
cleaves after the first C to leave a 2-base 5' overhang. Under normal reaction
conditions (100 mM NaCl, 10 mM Tris-HCl pH 8.4, 10 mM MgCl2, 10 mM
2-mercaptoethanol; 1 unit of enzyme/μg DNA, 65°C for 1 hr) the restriction
endonuclease Taq I recognizes TCGA and cleaves after the T to leave a 2-base
5' overhang.

Reaction conditions have been described for Taq I* activity which decrease
the cleavage specificity of Taq I (10 mM Tris-HCl pH 9.0, 5 mM MgCl2, 6 mM
2-mercaptoethanol, 20% DMSO; 2000 units of enzyme/μg DNA, 65°C for 1 hr)
(Barany, Gene 65:149-165 (1988)). These reaction conditions allow Taq I* to
cleave DNA at the following sequences:

Taq I*    TCGA
          CCGA (TCGG)
          ACGA (TCGT)
          TCTA (TAGA)
          TCAA (TTGA)
          GCGA (TCGC)

We are unaware of any literature descriptions of Hpa II* conditions.
However, the following conditions were established to promote Hpa II* activity
which are also compatible with Taq I* activity: 5 mM KCl, 10 mM Tris-HCl pH
8.5, 10 mM MgCl2, 1 mM DTT, 15% DMSO, 100 μg/ml BSA (CGase buffer);
50 units of enzyme/μg DNA, 50°C for 1 hr. The Hpa II* recognition sites were
determined by cloning and sequencing Hpa II* restricted fragments. The
characterized Hpa II* recognition sequences are as follows:

Hpa II*   CCGG
          CCGC (GCGG)
          CCGA (TCGG)
          ACGG (CCGT)

Taq I (400 units/μg DNA) and Hpa II (50 units/μg DNA) were then
combined (CGase I) in CGase I buffer and the following recognition sites were
identified by cloning and sequencing restricted pUC19 fragments.

CGase I   GCGC
          TCGA
          CCGG
          GCGT
          ACGA
          ACGG (CCGT)
          GCGG (CCGC)
          CCGA (TCGG)

CGase I restriction of natural DNA (e.g., pUC19, lambda) results in fragments
ranging from 20-200 bp in length (average 20-60 bp). Heat denaturation of these
fragments generates numerous oligonucleotides of variable length but precise
specificity for the cognate template, as was the case with CviJ I* digestion. CGase
I restriction of the small plasmid pUC19 (2686 bp) theoretically yields 174
restriction fragments, or 384 oligonucleotides after a heat denaturation step.
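
A scan for the CGase I sites listed above can be sketched as follows. The helper names and toy sequence are hypothetical. The sketch treats each listed four-base sequence and its reverse complement as a site on the strand being read, and it reports only site positions, since the exact cut position within each relaxed site is not stated here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


# Sites exactly as listed in the text for CGase I.
CGASE_I_SITES_AS_LISTED = ("GCGC", "TCGA", "CCGG", "GCGT",
                           "ACGA", "ACGG", "GCGG", "CCGA")

# Add reverse complements so a single-strand scan finds sites on either strand.
CGASE_I_SITES = set(CGASE_I_SITES_AS_LISTED) | {
    reverse_complement(s) for s in CGASE_I_SITES_AS_LISTED
}


def find_cgase_i_sites(seq):
    """Return 0-based start positions of four-base CGase I sites in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] in CGASE_I_SITES]


# Toy sequence (hypothetical): GCGC and TCGG are sites, ACGT is not listed.
toy_sites = find_cgase_i_sites("AAGCGCAATCGGAAACGTAA")
```

Every entry in the set has CG as its central dinucleotide, which is the sense in which the combined reagent behaves as a CG "two-cutter"; not every NCGN sequence is in the list, however (ACGT, for example, is absent).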

The "two-cutter" activities of CviJ I* and CGase I represent a unique class
of restriction endonuclease activity in that no other known restriction
endonucleases will generate this size range of oligonucleotides. The ability to
generate numerous oligonucleotides with perfect sequence specificity from any
DNA, without regard to sequence composition, genetic origin, or prior sequence
knowledge is one of the properties that CGase I shares with CviJ I*. In addition,
the generation of numerous oligonucleotides by CviJ I or CGase I results in a
form of probe or primer amplification not practical using conventional means of
organic synthesis.

Based on the ability to recognize a dinucleotide sequence, the present invention
contemplates the interchangeability of CGase I with CviJ I* in all of the
applications described herein.



Example 21 

Purification of CviJ I Restriction Endonuclease from
IL-3A-Infected Chlorella Cells

CviJ I was prepared by a modification of the method described by
Xia et al., Nucl. Acids Res. 15:6075-6090 (1987). Chlorella NC64A cells
(ATCC Accession No. 75399 deposited on January 21, 1993, American Type
Culture Collection, Rockville, Maryland) were infected with the virus IL-3A
(ATCC Accession No. 75354 deposited November 6, 1992, American Type
Culture Collection, Rockville, Maryland) according to Van Etten et al., Virology
126:117-125 (1983). Five grams of IL-3A infected Chlorella NC64A cells were
suspended in a glass homogenization flask with 15 g of 0.3 mm glass beads in
buffer A (10 mM Tris-HCl pH 7.9, 10 mM 2-mercaptoethanol, 50 μg/ml
phenylmethylsulfonyl fluoride (PMSF), 20 μg/ml benzamidine, 2 μg/ml
o-phenanthroline). Cell lysis was carried out at 4000 rpm for 90 sec in a Braun
MSK mechanical homogenizer (Allentown, PA) with cooling from a CO2 tank.
After lysis 2 M NaCl was added to a final concentration of 200 mM, after which
10% polyethyleneimine (PEI) (Life Technologies, Bethesda, MD) (pH 7.5) was
added to a final concentration of 0.3%. The mixture was then stirred for 2 hrs.
at 4°C, then centrifuged for 1 hr. at 50,000 g. Ammonium sulfate was added to
the supernatant to 70% saturation and stirred overnight. A protein pellet was
recovered by centrifugation for 1 hr. at 50,000 g. The resulting pellet was
dissolved in 20 ml of buffer B (20 mM Tris-acetate pH 7.5, 0.5 mM EDTA, 10
mM 2-mercaptoethanol, 10% glycerol, 30 mM KCl, 50 μg/ml PMSF, 20 μg/ml
benzamidine [Sigma, St. Louis, Missouri], 2 μg/ml o-phenanthroline [Sigma]) and
dialysed against 500 ml of buffer B with 3 changes. The dialysed solution was
then applied to a 1 x 6 cm Heparin-Sepharose (Pharmacia LKB, Piscataway, New
Jersey) column. After a 50 ml wash with buffer B, a 100 ml gradient of 0 to 0.7
M KCl in buffer B was run. Fractions having CviJ I activity, as measured by
digestion of pUC19 DNA and agarose gel electrophoresis, were pooled, diluted
in 5 volumes of buffer C (10 mM K/PO4 pH 7.4, 0.5 mM EDTA, 10 mM
2-mercaptoethanol, 75 mM NaCl, 0.05% Triton X-100, 10% glycerol, 50 μg/ml
PMSF, 20 μg/ml benzamidine, 2 μg/ml o-phenanthroline) and applied to a 1 x 7
cm Phosphocellulose P11 (Whatman) column equilibrated in buffer C. After
washing with 30 ml of buffer C, CviJ I was eluted by a 100 ml gradient of 0 to
0.7 M NaCl in buffer C. At this step CviJ I activity separated from non-specific
nucleases. CviJ I containing fractions were pooled and diluted in 4 volumes of
buffer C and applied to a 1 x 4 cm hydroxyapatite HTP column (BioRad,
Hercules, CA). After washing with 30 ml of buffer C, CviJ I was eluted by a 0
to 0.7 M potassium phosphate (pH 7.4) gradient in buffer C. Fractions
containing CviJ I activity and lacking non-specific nuclease activity were pooled,
dialysed overnight against storage buffer (50 mM potassium phosphate,
200 mM KCl, 0.5 mM EDTA, 50% glycerol, 20 μg/ml PMSF) and
stored at -20°C.
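
The "added to a final concentration of" steps in the protocol above are ordinary stock-dilution calculations. The sketch below is illustrative only: the lysate volume is a hypothetical value (the source does not give one), the function name is invented, and it assumes the sample contains none of the added component beforehand.

```python
def stock_volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v, so the
    result is in the same units as sample_volume and both concentrations
    must share one unit.
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_volume * final_conc / (stock_conc - final_conc)


lysate_ml = 18.0  # hypothetical lysate volume; not stated in the source

# 2 M NaCl stock to 200 mM final (both expressed in mM).
nacl_ml = stock_volume_to_add(lysate_ml, 2000.0, 200.0)

# 10% PEI stock to 0.3% final (both in percent), added after the NaCl.
pei_ml = stock_volume_to_add(lysate_ml + nacl_ml, 10.0, 0.3)

print(f"add {nacl_ml:.2f} ml of 2 M NaCl, then {pei_ml:.2f} ml of 10% PEI")
```

Because the PEI is added after the NaCl, the salt concentration ends up slightly below 200 mM; for the small PEI volume involved the difference is about 3%.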

Although the present invention has been described in terms of
preferred embodiments, it is intended that the present invention encompass all
modifications and variations which occur to those skilled in the art upon
consideration of the disclosure herein, and in particular those embodiments which
are within the broadest proper interpretation of the claims and their requirements.

SEQUENCE LISTING

(1) GENERAL INFORMATION : 

(i) APPLICANT: Molecular Biology Resources, Inc. 

(ii) TITLE OF INVENTION: Materials and Methods for 

Restriction Endonuclease Applications 
(iii) NUMBER OF SEQUENCES: 13 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun
(B) STREET: 6300 Sears Tower, 233 South Wacker Drive

(C) CITY: Chicago 

(D) STATE: Illinois 

(E) COUNTRY: United States of America
(F) ZIP: 60606-6402

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE: Patentln Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Clough, David W.

(B) REGISTRATION NUMBER: 36,107 

(C) REFERENCE /DOCKET NUMBER: 28003/31967/PCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 312/474-6300 

(B) TELEFAX: 312/474-0448 

(C) TELEX: 25-3856 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CAATTTCACA CAGGAAACAG CTATGTCTTT TCGCACGTTA GAAC 44
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5496 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
ATGTCTTTTC GCACGTTAGA ACTATTCCCC GGTATAGCTG CTATTTCACA TGGCCTCAGA 60
GCTATATCTA CACCAGTTGC ATTCGTAGAA ATTAATGAAG ACGCACAAAA ATTCTTGAAA 120
ACAAAGTTTT CAGATGCATC TCTATTCAAT CACGTTACGA AATTTACCAA ATCGGACTTC 180
CCAGAAGACA TAGACATGAT TACTGCGGGA TTCCCGTGCA CTGCGTTTAG TATTGCAGGT 240
TCTAGAACTC GATTCGAACA CAAGGAATCC GCTCTCTTTG CTGATGTTGT GCGAATCACG 300
GAAGAGTATA AACCTAAAAT AGTGTTTTTG GAAAACTCCC ATATGTTGTC CCACACTTAC 360
AATCTCGATG TCGTCGTAAA AAAGATGGAT GAAATTGGTT ATTTCTGCAA GTGGGTAACT 420
TGTCGGGCAT CAATTATAGG AGCCCATCAT CAACGCCACC GCTGGTTTTC TCTCGCCATT 480
CGAAAAGATT ATGAACCAGA AGAAATAATT GTATCTGTGA ATGCTACAAA GTTOGACTGG 540
GAAAATAATG AACCACCGTC TCAAGTAGAC AATAAGAGTT ACCAGAATTC AACTCTTGTT 600
CGTCTGGCAG GATATTCCGT GGTCCCCGAC CAGATCAGAT ATGCTTTCAC CGGTCTATTT 660
ACAGCTGATT TTGAGTCATC GTGCAAAACT ACCTTGACAC CTGGGACAAT AATTGGCACG 720
GAACACAAAA AAATGAAAGG AACTTACGAT AAAGTCATAA ACGGGTATTA TGAGAACCAT 780
GTGTATTATT CTTTTTCAAC GAAAGAAGTT CATCGCCCTC CTCTAAATAT ATCCGTGAAA 840
CCACGTCATA TTCCGGAGAA ACATAACGGA AAAACACTCG TAGATCGCGA AATGATCAAG 900
AAATATTGGT GCACACCATG TGCTAGTTAT GGCACTGCTA CTGCTGGATG CAATGTTCTG 960
ACAGACCGTC AGTCACATGC ACTTCCTACA CAAGTCAGGT TTTCATATAG GGGTGTATGT 1020
GGACGACATT TGTCTGGTAT ATGCTCTGCA TGGTTGATGG GGTATGACCA ACAATATCTT 1080
GGTTATTTGG TTCAATATGA TTAAAATATT TTGATACACT AAATGGATAT AACAAGAAAA 1140
CCTTTTACAA TACAAGGCGC TAAACGTATA ATACTCGAAA AAAAGACACT TGAAGAGAAA 1200
AAAAGAATTG CGGAAGAGAA AAAAAGAATT GCACTTATAG AAAAACAACG AATTGCGGAA 1260
GAGAAAAAAA GAATTGCGGA AGAGAAAAAA CGATTCGCAC TTGAAGAGAA AAAACGAATT 1320
CCGGAAGAAA AAAAACGAAT CGCGGAAGAG AAAAAACGAA TCGTGGAACA CAAAAAAAGA 1380
CTTGCACTTA TAGAAAAACA ACGAATTGCG GAAGAGAAAA TTGCCTCGGG GAGAAAAATT 1440
AGAAAGAGGA TCTCTACAAA TGCAACAAAA CATGAAAGAG AATTTGTCAA AGTTATAAAT 1500
TCAATCTTCC TCGGACCCCC TACTTTTGTA TTCGTAGATA TAAAAGCTAA TAAATCCAGA 1560
GAAATCCACA ACGTTGTAAG ATTCAGACAA TTACAAGGCA CTAAAGCGAA ATCCCCGACC 1620
GCGTATGTTG ATAGAGAATA TAACAAACCT AAAGCGGATA TAGCAGCGGT AGACATAACC 1680
GGTAAAGATG TGGCATGGAT ATCCCATAAA GCATCTGAAG CATATCAACA ATATCTAAAA 1740
ATTTCTGGAA AGAACCTCAA GTTCACAGGA AAAGAATTAG AAGAAGTTCT ATCGTTCAAG 1800
AGAAAAGTAG TTAGTATGGC ACCGGTATCT AAAATATGGC CTGCTAATAA GACCGTATGG 1860
TCTCCTATCA AGTCAAATTT GATTAAAAAT CAAGCAATAT TCGGATTTGA TTACGGTAAG 1920
AAACCAGGAA GGGACAATGT AGACATCATA GGTCAAGGAC GACCAATTAT AACAAAAAGA 1980
GGTTCCATAT TATATCTTAC ATTCACTGGT TTTAGCCCAT TAAATCCCCA CTTGGAGAAT 2040
TTTACTGGGA AACATGAACC CGTTTTCTAT GTAAGAACAG AACGGAGTAC TAGCGGGAGA 2100
AGTATAAGAA CTGTCGTCAA TGGTGTCACT TATAAAAATT TAAGATTCTT TATACATCCA 2160
TACAACTTTG TTTCTTCAAA AACACAACGT ATTATGTAGG ACCATTTTCC CGAGAGACTT 2220
TGTTGACCGC GTACTAAAAA ATGGTGACGA TATTTGTCTA AAGATGCTCA TAGAAGCAGG 2280
TGCAAACCTT GACATCGTCA GTGTTGACTA TACACCATTA CATCTACATC TGGTGATATT 2340
TGTATAAACG GTAAATACCT ATATATACAA TACGTATCCC CCTAAAAGCG CTTAGATTTT 2400
TTAGTTGTAT ACTACTTTTG TATAAGACCT GTAAGTTACA AACTAAAAGT TTCAG C T T TG * 2460
CCTTCGAAAC AAGCAATTAC CGCATGAGAA TAATATCCAT TATGGATGTT TTCTGCTAAT 2520
AAAACGATAT TTCCTACAGA AGTTTCTATG ATTAGTTCCG AAATATTGAG ATCATCCTCA 2580
CGTTTTTCTT TACCGTATTT TACTTTCGTG ATCGTCCCAC CAATAAAATC ATCTCGTGTC 2640
AGTTCATTCG GCAATTGTGC CGTGACACCA AATCTCTCAC AACAACCTTG ATGTCCATCC 2700
ATTGCTAACA CTATCGGTAA TCCATGTGTG GTGTGTACGA CCACACCGTT ATAACTATAA 2760
CACGTGTAGT TGTCGTCTAT ATCATATAAC TCGAGAGCGG TGTCAACTTC TTCAGATCTA 2820
TTATTAATCG GATCTGATCC ATAAGAAGAA TCTTCATATT TACAAATAAA ATCATCCGAT 2880
ATGTTCTGCA CACGAACAAC ATTCGTCAAA TTTCTGTGAT GACGAATCTC CATCTCTGAA 2940
TCATTAGAGA CTTGCGAGTA TATAACATTA TAATTCTTGA TATGATTATT ACGTTTCATA 3000
TCAACAAAAT ACATATAAAC ACCATACAAA TATTAAAACA CGTTAGTATA TAATGGATAA 3060
CATTTGCAAT AGTATATTCA CTGCAGTAAA AAATGGCCAC GAAGCTTGTT TGAAGATGAT 3120
GCTCATTGAA AGAGGTAGCA ATATCAATGA TGTTTCCGAA TCAAAATATG GAAATACACC 3180
ACTACATATT CCAGCTCATC ATGCTAATGA TGTCTGTTTG AAGATGCTTA TTGACGCAGG 3240
TGCAAACCTT GATATCACAG ATATTTCTGG AGGAACACCA CTTCATCGTG CGGTTTTGAA 3300
TCGCCATGAC ATATTGTACA GATGCTCGTA GAAGCAGGTG CAAACCTTAG TATCATAACT 3360
AATTTGGGAT GGATACCGTT ACATTACGCG GCTTTTAATG CTAATGATGC GATTTTGAGG 3420
ATGCTCATCG TTGTAAGTGA TAATCTTGAC GTTATCAATG ATCGCGGTTG GACGGCGTTA 3480
CATTACGCGG CTTTTAATGG TCATAGCATG TGCGTCAAGA CGCTTATTGA TGCGGGTGCA 3540
AATCTTGACA TCACAGATAT TTCGCGATGT ACACCACTTC ATCGTGCCCT TTATAATGAC 3600
CACGATGCAT GTGTGAAGAT ACTCGTAGAA GCAGGTGCAA CTCTTGACGT CATTGATGAT 3660
ACTOAGTGCG TGCCGTTACA TTACGCGGCT TTTAATGGTA ATGATGCGAT TTTGAGCATG 3720
CTCATTGAAG CAGGTGCAGA TATTGATATA TCTAATATAT CTGATTGGAC GGCGTTACAT 3780
TACGCGGCTC GAAATGGACA CGATGTGTGT ATAAAAACAC TCATCGAAGC AGGTGGTAAC 3840
ATCAACGCCG TCAACAAATC CCCCCATACA CCACTAGATA TTGCAGCATG TCATGACATT 3900
GCAGTATGTG TGATCGTGAT AGTCAATAAG ATCGTTTCGG AGCGGCCGTT GCGTCCGAGT 3960
GAGTTGTGTQ TCATACCACC AACGTCTGCT GCATTAGGTG ATGTGTTGCG AACGACGATC 4020
CGGCTTCATG GGCGATCGGA AGCTGCAAAG ATCACACCGC ATCTTCCTGT CGGTGCAAGG 4080
GATACTCTAC GAACTACTGC GTTGTGTTTG AACCGAACAA TTTCCCAGAG ATCTCGTTGA 4140
TAGTGTATTA ATTGAATGCG TCTAAAGTTA CGCTATTTTT TTCCAAAAAG CGTTTGCATG 4200
AAATACAACA CGATCTTTTG TAGATCGTTT ACCATTAGTT GTATTCGTGC AATAGACACC 4260
ATACCTACCT CCAAATTCAT TTACTTTACC TACAGTATTA CCACTTCCTT TTTTTCCTAT 4320
AGTAGTATCT AAATTCAACC CTTTGAACTC ATCCCCATTA ACAGACAGAG CGTATGAACC 4380
GTTTTGTGCC AATTTCACCT TCAAAACGAT AGTAACCCAT TGACCTCTAG GAATTTTAAC 4440
CGATCTTATA AGTATCTGCT TACTTCCAAG TCCTTTTTCA AAAGCATACA ACGATCCTCT 4500
AAGGTTATCC CCAGAACCTG AAATTGTAAA GAACGACTGG AAATGAATAG GTTGCATTAG 4560
ATCTGTATAC ATATCACTTG CTTCGAAATG AAAATCCTAG TCCCAATTAG GTACGTTCCA 4620
CCAAGTTTAA TACGGGGTCT TTCCACCCAG ACCGGACATT TCAGCACGAG CCTTGTAAGA 4680
ATGATATGAT GTGGTTAAAT CTCTATCACC ATCGTTCCAC TTTCCTCTGA ACCGAAGACC 4740
ATGCATCGTT ATACCTGGTG CAACCTGTAC TAAATTCTTT ATTTCACGTG CCCCTCCGCC 4800
TGGATTAACT CGAGATTCGT CAAATCTAAA ATATGATAAC GATGTTCCAA CAGTAGAACC 4860
ACTGGGTGGT ATGGCAGTTG CTGGAAGCGA AGGTAAAACT TTAGGATATT TCAAATCACC 4920
AACACCTTGA GGGTTTACTT GAATACTTCT GGGAGATGTT GCTGGTTTCG TCGAAGCTGC 4980
TTTCGTTGAA GGTCGTTTCG TCGAAGCTGC TTTCGTCGAA GCTGGTTTCG TCGAAGCTGC 5040
TTTCCTCGAA CGTGGTTTCC TCGAAGCTGC TTTCGTCGAA GCTGGTTTCG TCGAAGCTGC 5100
TTTCGTCGAA GGTCGTTTCG TCGAAGCTGC TTTCGTCGAA GCTGGTTTCG TCGAAGCTGC 5160
TTTCCTCGAA GGTGCTTTCG TCGAAGCTCG TTTCCTTGGC GGAAGTGCGG CATGACCATA 5220
ATCCGTTAAA TTCCCCCATT CACCTAATCA TGTACTCCAT AAAGAACCGG C7GCCCATTC 5280
CATTCTTATT GGTTCTGTAG TATCACATAT ACATACGAAA TAATGAGAAT CATTTTCCCT 5340
GCCAAATAAT TTACCAGATT TGCCTTTACA TGACATTATT TCTAATATAA TATTATTATA 5400
ATTTTAAAAA AACTAACGTC TATTTAAAAT TATGTAATAC GTATTATATC AATCCATCAT 5460
CTTAATCATT TCCTAACGTA TAAGCGTAGC GAATTC 5496

* Sir 1*1 tT A ^ A AGA "* CGT »* ACA ATA CAA GCG OCT AAA CGT
Met A.p He Arg Arg Lye Atg Phe Thr lie Clu ci? Ail JJi £J
15 20 25

ATA ATA CTC CAA AAA AAG AGA CTT CAA GAG AAA AAA AGA ATT CCC en
He lie Leu Clu Lye Lye Arg Leu Glu Glu Lye iyt i£J III ^

35 40

GAG AAA AAA AGA ATT CCA CTT ATA CAA AAA CAA CGA ATT cee r»»
Glu Ly. Lye Arg He Ala Leu lie Clu Lye Si 2J i£ J£ £J XJ
45 50 5S

85 90

s si s: £ s s 2 5 s s: s is s 5 s s

U 115 120

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1225 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(1..33, 55..1128)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ACA AAT CCA ACA AAA CAT GAA AGA GAA TTT CTC AAA GTT ATA AAT TCA 435
Thr Asn Ala Thr Lys His Glu Arg Glu Phe Val Lys Val Ile Asn Ser
125 130 135

ATG TTC GTC GGA CCC GCT ACT TTT GTA TTC GTA GAT ATA AAA GGT AAT 483
Met Phe Val Gly Pro Ala Thr Phe Val Phe Val Asp Ile Lys Gly Asn
140 145 150

AAA TCC AGA GAA ATC CAC AAC GTT GTA AGA TTC AGA CAA TTA CAA GGC 531
Lys Ser Arg Glu Ile His Asn Val Val Arg Phe Arg Gln Leu Gln Gly
155 160 165 170

ACT AAA GCG AAA TCC CCG ACC GCG TAT GTT GAT AGA GAA TAT AAC AAA 579
Ser Lys Ala Lys Ser Pro Thr Ala Tyr Val Asp Arg Glu Tyr Asn Lys
175 180 185

CCT AAA GCG GAT ATA GCA GCG GTA GAC ATA ACC GGT AAA GAT GTG CCA 627
Pro Lys Ala Asp Ile Ala Ala Val Asp Ile Thr Gly Lys Asp Val Ala
190 195 200

Jro !2 J5 SI Sf* 7* 5** °? A TAT <** «» TAT CTA AAA ATT 675
Trp Ile Ser His Lys Ala Ser Glu Gly Tyr Gln Gln Tyr Leu Lys Ile
205 210 215

TCT GGA AAC AAC CTC AAG TTC ACA GCA AAA GAA TTA CAA GAA GTT CTA 723
Ser Gly Lys Asn Leu Lys Phe Thr Gly Lys Glu Leu Glu Glu Val Leu
220 225 230

TCG TTC AAG AGA AAA CTA GTT ACT ATC CCA CCC CTA TCT AAA ATA TOO 771
Ser Phe Lys Arg Lys Val Val Ser Met Ala Pro Val Ser Lys Ile Trp
235 240 245 250

CCT GCT AAT AAC ACC GTA TCG TCT CCT ATC AAC TCA AAT TTC ATT AAA 819
Pro Ala Asn Lys Thr Val Trp Ser Pro Ile Lys Ser Asn Leu Ile Lys
255 260 265

AAT CAA GCA ATA TTC GGA TTT GAT TAC CCT AAG AAA CCA GGA AGC GAC 867
Asn Gln Ala Ile Phe Gly Phe Asp Tyr Gly Lys Lys Pro GlJ Sp
270 275 280

Jin vll A« lie I1 A £1 ^ A CGA CCA ATT ATA *CA AAA AGA GGT 915
Asn Val Asp Ile Ile Gly Gln Gly Arg Pro Ile Ile Thr Lys Arg Gly
285 290 295

Sr tT A r T I A lit 7 f T ^ TTC ACT 001 A GC GCA TTA AAT GGG CAC 963
loo Tyr Thr ^ e Thr C1 * Phe Ser Ala Le « Asn Gly His
300 305 310

TTC GAG AAT TTT ACT GGG AAA CAT GAA CCC GTT TTC TAT GTA AGA ACA 1011
Leu Glu Asn Phe Thr Gly Lys His Glu Pro Val Phe Tyr Val Arg Thr
315 320 325 330

GAA CCC ACT ACT AGC GGC AGA AGT ATA ACA ACT GTC GTC AAT GGT GTC 1059
Glu Arg Ser Ser Ser Gly Arg Ser Ile Thr Thr Val Val Asn Gly Val
335 340 345

ACT TAT AAA AAT TTA AGA TTC TTT ATA CAT CCA TAC AAC TTT GTT TCT 1107
Thr Tyr Lys Asn Leu Arg Phe Phe Ile His Pro Tyr Asn Phe Val Ser
350 355 360

2 SI S S£ S S SSI TACGACCATT CTCCCGACAG ACTTTGTTGA 1158
365

CCGCGTACTA AAAAATGGTC ACGATATTTG TCTAAAGATG CTCATAGAAG CAGGTGCAAA 1218
CCTTGAC 1225

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Gln Glu Tyr Leu Gly Tyr Leu Val Gln Tyr Asp Met Asp Ile Arg Arg
5 10 15

Lys Arg Phe Thr Ile Glu Gly Ala Lys Arg Ile Ile Leu Glu Lys Lys
20 25 30

Arg Leu Glu Glu Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg Ile Ala
35 40 45

Leu Ile Glu Lys Gln Arg Ile Ala Glu Glu Lys Lys Arg Ile Ala Glu
50 55 60

Glu Lys Lys Arg Phe Ala Leu Glu Glu Lys Lys Arg Ile Ala Glu Glu
65 70 75 80

Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg Ile Val Glu Glu Lys Lys
85 90 95

Arg Leu Ala Leu Ile Glu Lys Gln Arg Ile Ala Glu Glu Lys Ile Ala
100 105 110

Ser Gly Arg Lys Ile Arg Lys Arg Ile Ser Thr Asn Ala Thr Lys His
115 120 125

Glu Arg Glu Phe Val Lys Val Ile Asn Ser Met Phe Val Gly Pro Ala
130 135 140

Thr Phe Val Phe Val Asp Ile Lys Gly Asn Lys Ser Arg Glu Ile His
145 150 155 160

Asn Val Val Arg Phe Arg Gln Leu Gln Gly Ser Lys Ala Lys Ser Pro
165 170 175

Thr Ala Tyr Val Asp Arg Glu Tyr Asn Lys Pro Lys Ala Asp Ile Ala
180 185 190

Ala Val Asp Ile Thr Gly Lys Asp Val Ala Trp Ile Ser His Lys Ala
195 200 205

Ser Glu Gly Tyr Gln Gln Tyr Leu Lys Ile Ser Gly Lys Asn Leu Lys
210 215 220
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GTAAAACGAC GGCCAGT
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GCCAAGCTTG GATGAT 16
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(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ATCTTCGCGA ATTCACTGGC CGTCGTTTTA C 31 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GAATTCGCGA AGAT 14

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 
ATCATCCAAG CTTGGCACTG GCCGTCGTTT TAC 33
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GTAAAACGAC GCCCACTGAA TTCGCGAAGA TNNNNNNNNN NNNNNNNNAT CATCCAAGCT 60
TGGCACTGGC CGTCGTTTTA C 81
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(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTAAAACGAC GGCCACTGCC AAGCTTGGAT GATNNNNNNN NNNNNNNNNN ATCTTCGCGA 60
ATTCACTGGC CGTCGTTTTA C 81
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: join(26..148, 190..207, 244..270)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

TAACAATTTC ACACAGGAAA CAGCT ATG ACC ATG ATT ACG CCA AGC TCG AAA 52
Met Thr Met Ile Thr Pro Ser Ser Lys
1 5

TTA ACC CTC ACT AAA GGG AAC AAA AGC TGG TAC CGG GGC CCC CCC TCG 100
Leu Thr Leu Thr Lys Gly Asn Lys Ser Trp Tyr Arg Gly Pro Pro Ser
10 15 20 25

AGG TCG ACG GTA TCG ATA AGC TTG ATA AAC CAT TTA TAC AAT AAG CGT 148
Arg Ser Thr Val Ser Ile Ser Leu Ile Asn His Leu Tyr Asn Lys Arg
30 35 40

TGATATAAGT TTGTATATAC GTCATTTCGT TATATCAACA A ATG TTA TCA TAT 201
Met Leu Ser Tyr
45

TAT ACG TAAAACTGGC TTAAAAAAAA ACGAGGTGTA ACTATA ATG TCT TTT CGC 255
Tyr Thr Met Ser Phe Arg
50

ACG TTA GAA CTA TTT 270
Thr Leu Glu Leu Phe
55
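
The CDS feature location given for SEQ ID NO: 12 can be checked against the 56-residue length stated for SEQ ID NO: 13. The following is a minimal sketch, not part of the original listing; the helper names `parse_join` and `coding_length` are hypothetical, and the location string is taken as `join(26..148, 190..207, 244..270)`.

```python
import re


def parse_join(location):
    """Parse 'join(a..b, c..d, ...)' into 1-based inclusive (start, end) pairs."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


def coding_length(segments):
    """Total number of nucleotides covered by the joined segments."""
    return sum(end - start + 1 for start, end in segments)


segments = parse_join("join(26..148, 190..207, 244..270)")
nt = coding_length(segments)      # 123 + 18 + 27 nucleotides
codons = nt // 3                  # whole codons encoded by the joined segments
print(segments, nt, codons)
```

Under these assumptions the three segments cover 168 nucleotides, i.e. 56 codons, which agrees with the length given for SEQ ID NO: 13.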
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(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Thr Met Ile Thr Pro Ser Ser Lys Leu Thr Leu Thr Lys Gly Asn
1 5 10 15

Lys Ser Trp Tyr Arg Gly Pro Pro Ser Arg Ser Thr Val Ser Ile Ser
20 25 30

Leu Ile Asn His Leu Tyr Asn Lys Arg Met Leu Ser Tyr Tyr Thr Met
35 40 45

Ser Phe Arg Thr Leu Glu Leu Phe
50 55
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 79, line 13

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet [ ]

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

November 6, 1992

Accession Number

A.T.C.C. 75354

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 79, line [illegible]

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet [x]

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

January 21, 1993

Accession Number

A.T.C.C. 75399

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 31, line 25

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

June 30, 1994

Accession Number

A.T.C.C. 69341

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ]

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
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WE CLAIM: 

1. A purified and isolated polynucleotide encoding a CviJI
polypeptide or a variant thereof possessing activity characteristic of CviJI, said
polynucleotide comprising a polynucleotide as set out in SEQ ID NO: 2.

2. The polynucleotide of claim 1 which is a DNA. 

3. The DNA of claim 2 which is a viral genomic DNA 
sequence or a biological replica thereof. 

4. The DNA of claim 2 which is a wholly or partially 
chemically synthesized DNA or biological replica thereof. 

5. A purified isolated DNA encoding a polypeptide according 
to claim 1 by means of degenerate codons. 

6. A vector comprising a DNA according to claim 2. 

7. The vector of claim 6 which is the plasmid pCJH1.4 (ATCC
Accession No. 69341).

8. A host cell stably transformed or transfected with a DNA
according to claim 2 in a manner allowing the expression in said host cell of a
CviJI polypeptide or a variant thereof possessing a sequence specificity
characteristic of CviJI.

9. The host cell according to claim 8, wherein said host cell
is E. coli.

10. A method for producing a CviJI polypeptide or a variant
thereof possessing biological activity specific to CviJI, said method comprising the
steps of:

a) growing a transformed host cell containing a vector
according to claim 6 in a suitable nutrient medium; and

b) isolating the CviJI polypeptide or variant thereof from
said host cell.

11. The method of claim 10 wherein said host cell is E. coli.

12. A recombinant CviJI polypeptide.

13. A polypeptide produced by the method of claim 10. 

14. A method for restriction endonuclease digestion of DNA 
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is cleaved at a dinucleotide sequence selected 
from the group consisting of PyGCPy, PuGCPy, PuGCPu, and wherein Pu = 
purine and Py = pyrimidine. 
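
The site classes recited in claim 14 can be illustrated with a short sequence scan. This is a minimal sketch only, assuming an unambiguous A/C/G/T input; the names `SITE_CLASSES` and `find_sites` are hypothetical and do not come from the specification.

```python
import re

# Shorthand used in the claim: Pu = purine (A or G), Py = pyrimidine (C or T).
PU = "AG"
PY = "CT"

# The three site classes named in claim 14, written as regex character classes.
SITE_CLASSES = {
    "PyGCPy": f"[{PY}]GC[{PY}]",
    "PuGCPy": f"[{PU}]GC[{PY}]",
    "PuGCPu": f"[{PU}]GC[{PU}]",
}


def find_sites(seq, classes=SITE_CLASSES):
    """Return sorted (position, class_name, site) tuples.

    position is the 0-based index of the G of the central GC dinucleotide.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in classes.items():
        # A lookahead is used so that overlapping sites are all reported.
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((m.start() + 1, name, m.group(1)))
    return sorted(hits)


hits = find_sites("AGCTAATGCCAAGGCAAATGCA")
for hit in hits:
    print(hit)
```

In this toy sequence the final GC (in TGCA, a PyGCPu context) is not reported, because PyGCPu is not among the classes listed in the claim.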

15. A method for restriction endonuclease digestion of DNA 
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is digested at 11 of 16 possible dinucleotide 
sequences and wherein said dinucleotide sequences are selected from the group 
consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu, and wherein Pu = 
purine and Py = pyrimidine. 



16. The method according to claim 14 wherein said restriction 
endonuclease reagent comprises CviJ I. 
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17. A restriction endonuclease reagent, said restriction 
endonuclease reagent comprising in combination, Taq I and Hpa II (CGase I), 
said reagent capable of digesting DNA at 11 of 16 possible dinucleotide 
sequences, said sequences selected from the group consisting of PuCGPu, 
PuCGPy, PyCGPy and PyCGPu, and wherein Pu = purine and Py = pyrimidine.

18. The method according to claim 15 wherein said restriction 
endonuclease reagent is selected from the group consisting of Aci I and CGase I. 

19. The method according to claim 16 wherein said digestion 
of DNA is a partial digestion and wherein said digestion generates quasi-random 
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose 
gel. 

20. The method according to claim 18 wherein said digestion of
DNA is a partial digestion and wherein said digestion generates quasi-random 
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose 
gel. 

21. The method according to claims 16 or 18 wherein said 
digestion is complete, and wherein said digestion generates DNA fragments from 
about 20 base pairs in length to about 200 base pairs in length and wherein said 
fragments have an average length of about 20 to about 60 nucleotides. 
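
As a rough plausibility check on the fragment sizes recited in claim 21, the expected spacing between sites can be estimated under the simplifying assumption (not made in the source) that the four bases occur independently and with equal frequency. Under that assumption a central GC dinucleotide occurs with probability 1/16, and the flanking purine/pyrimidine constraints scale that figure:

```latex
\begin{align*}
P(\text{PuGCPy}) &= \tfrac{1}{2}\cdot\tfrac{1}{16}\cdot\tfrac{1}{2} = \tfrac{1}{64}
  &&\Rightarrow\ \text{mean spacing} \approx 64\ \text{bp},\\
P(\text{PyGCPy, PuGCPy or PuGCPu}) &= \tfrac{3}{4}\cdot\tfrac{1}{16} = \tfrac{3}{64}
  &&\Rightarrow\ \text{mean spacing} \approx \tfrac{64}{3} \approx 21\ \text{bp}.
\end{align*}
```

Both estimates are of the same order as the average length of about 20 to about 60 nucleotides stated in the claim; real DNA departs from the equal-frequency assumption, so these figures are indicative only.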

22. The method according to claims 19 or 20 wherein said
quasi-random fragments are from about 100 base pairs to about 10,000 base pairs in
length.
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23. A method for shotgun cloning and sequencing DNA, 
comprising the steps of: 

a) partially digesting DNA according to claims 19 or 20; 

b) ligating said partially digested DNA into a linearized 
cloning vector thereby creating a recombinant vector; 

c) introducing said recombinant vector into a host cell; 

d) selecting said host cell for the presence of said recombinant 
vector; 

e) growing and amplifying said host cell containing said 
recombinant vector; 

f) isolating and purifying said recombinant vector from said 
grown and amplified host cells; and 

g) sequencing said DNA contained in said recombinant vector. 

24. The method according to claim 23 wherein said restriction 
endonuclease reagent comprises CviJ I.

25. The method according to claim 23 wherein said restriction 
endonuclease reagent comprises CGase I. 

26. The method according to claim 23 wherein said quasi-random 
fragments are from about 100 base pairs to about 10,000 base pairs in length. 

27. The method according to claim 23 wherein said quasi-random 
fragments are from about 500 bp to about 2,000 bp in length. 
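
The partial digestion and size selection underlying claims 23, 26 and 27 can be sketched as a small simulation. This is an illustration under stated assumptions, not the claimed procedure: each site is assumed to be cut independently with the same probability, the cut is assumed to fall between the G and the C of the site, and the function names are hypothetical.

```python
import random
import re

# Sites in the three classes of claim 14 (PyGCPy, PuGCPy, PuGCPu),
# i.e. every NGCN context except PyGCPu.
SITE = re.compile(r"(?=([AG]GC[ACGT]|[CT]GC[CT]))")


def cut_positions(seq):
    """Indices at which a cut would fall (assumed: between G and C)."""
    return [m.start() + 2 for m in SITE.finditer(seq.upper())]


def partial_digest(seq, cut_prob, rng):
    """Cut each site independently with probability cut_prob; return fragment lengths."""
    cuts = [p for p in cut_positions(seq) if rng.random() < cut_prob]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def size_select(lengths, lo, hi):
    """Keep only fragment lengths within [lo, hi], e.g. 500 to 2000 bp."""
    return [n for n in lengths if lo <= n <= hi]


rng = random.Random(0)
template = "".join(rng.choice("ACGT") for _ in range(20000))
fragments = partial_digest(template, cut_prob=0.02, rng=rng)
selected = size_select(fragments, 500, 2000)
print(len(fragments), len(selected))
```

Lowering `cut_prob` shifts the fragment distribution toward longer fragments; a value of 1.0 corresponds to a complete digest.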



28. The method according to claim 23 wherein said cloning vector
is selected from the group consisting of plasmids, phage, and cosmids.

29. The method according to claim 28 wherein said plasmid is
pUC19.

30. The method according to claim 28 wherein said bacteriophage
is λ.

31. The method according to claim 28 wherein said bacteriophage
is M13.

32. The method according to claim 23 wherein said host cell is a
bacteria.

33. The method according to claim 32 wherein said host cell is
E. coli.

34. The method according to claim 23 wherein said sequencing is 
dideoxy sequencing. 

35. A kit for the shotgun cloning of DNA, said kit comprising in
association:

a) a restriction endonuclease reagent, according to 
claims 16 or 18; 

b) a restriction endonuclease buffer; 

c) ligation buffer; and 

d) T4 DNA ligase. 
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36. The kit of claim 35 further comprising in association: 

e) competent host bacteria; 

f) chromatography matrix said matrix useful for the size 
selection of restriction endonuclease digested DNA; 

g) spin filters, said spin filters useful for the size selection of 
restriction endonuclease digested DNA; 

h) a cloning vector; 

i) positive control DNA useful in the monitoring of the 
efficiency of the said shotgun cloning; and 

j) molecular size marker DNA. 

37. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CviJ I. 

38. The kit according to claim 37 wherein said restriction 
endonuclease buffer is CviJ I** buffer.

39. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CGase I. 

40. The kit according to claim 39 wherein said restriction 
endonuclease buffer is CGase I buffer. 

41. The kit according to claim 36 wherein said competent host 
bacteria is competent E. coli DH5αF'.

42. The kit according to claim 36 wherein said chromatography
matrix is Sephacryl-S500.

43. The kit according to claim 36 wherein said cloning vector is
M13mp18.

44. A method for labeling DNA, the method comprising the steps
of:

a) digesting an aliquot of template DNA with a restriction 
endonuclease reagent according to claim 21 and wherein said 
digestion generates sequence-specific DNA fragments; 

b) mixing an aliquot of undigested template DNA with said 
sequence-specific DNA fragments, denaturing said mixture of 
template DNA and sequence-specific DNA fragments thereby 
generating denatured template DNA and oligonucleotide primers;

c) annealing said primers to said denatured undigested template 
DNA to form a DNA-primer complex; 

d) performing an extension reaction from said primers in said 
DNA-primer complex using a DNA polymerase in the presence of 
one or more nucleotide triphosphates and wherein at least one 
nucleotide triphosphate has a label. 
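
Steps a) and b) of claim 44 can be pictured as follows: a complete digest yields short double-stranded fragments, and denaturing each fragment yields two single strands that can serve as primers. The sketch below is illustrative only; it assumes blunt cleavage between G and C at the site classes of claim 14, and the function names are hypothetical.

```python
import re

# Sites in the three classes of claim 14 (every NGCN context except PyGCPu).
SITE = re.compile(r"(?=([AG]GC[ACGT]|[CT]GC[CT]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def complete_digest(seq):
    """Cut at every site (assumed: between G and C) and return the top-strand fragments."""
    seq = seq.upper()
    cuts = [m.start() + 2 for m in SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def primers_from_digest(seq):
    """Denature each blunt fragment into its two strands, both written 5'->3'."""
    primers = []
    for frag in complete_digest(seq):
        primers.append(frag)
        primers.append(reverse_complement(frag))
    return primers


template = "AGCTAATGCCAAGGCAAATGCA"
print(complete_digest(template))
print(primers_from_digest(template))
```

Because every primer is derived from the template itself, each one anneals to a defined position on the undigested template without prior knowledge of its sequence, which is the property the claim relies on.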

45. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CviJ I.

46. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CGase I. 

47. The method according to claim 44 wherein said extension
reaction is performed by a DNA polymerase.

48. The method according to claim 47 wherein said DNA
polymerase is Thermus flavus DNA polymerase.

49. The method according to claim 44 wherein the one or more 
nucleotide triphosphates are selected from the group consisting of dATP, dCTP, 
dGTP, dUTP and dTTP. 

50. The method according to claim 44 wherein said labeled 
nucleotide triphosphate is selected from the group consisting of 32P-labeled
nucleotide triphosphates and 33P-labeled nucleotide triphosphates.

51. The method according to claim 44 wherein said labeled 
nucleotide triphosphate is selected from the group consisting of biotin-labeled 
nucleotide triphosphates, fluorescein-labeled nucleotide triphosphates,
dinitrophenol-labeled nucleotide triphosphates, and digoxigenin-labeled nucleotide 
triphosphates. 
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52. A method for thermal cycle labeling DNA comprising the 

steps of: 

a) digesting an aliquot of template DNA with a restriction 
endonuclease reagent according to claim 21 and wherein said 
digestion generates sequence-specific DNA fragments; 

b) mixing an aliquot of undigested template DNA with said 
sequence-specific DNA fragments, denaturing said mixture of 
template DNA and said DNA fragments thereby generating 
denatured template DNA and oligonucleotide primers; 

c) annealing said primers to said denatured undigested template 
DNA to form a DNA-primer complex; 

d) performing an extension reaction from said primers in said 
DNA-primer complex using a DNA polymerase in the presence of 
one or more nucleotide triphosphates and wherein at least one 
nucleotide triphosphate has a label;

e) heat-denaturing said labeled extension products; 

f) reannealing said excess primers with said template DNA 
and with said extension products; 

g) performing at least one additional extension reaction from 
said DNA-primer complex using a DNA polymerase. 

53. The method according to claim 52 wherein said restriction 
endonuclease reagent comprises CviJ I.

54. The method according to claim 52 wherein said restriction 
endonuclease comprises CGase I. 

55. The method according to claim 52 wherein said DNA
polymerase is a heat stable DNA polymerase.

56. The method according to claim 55 wherein said heat-stable
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment
thereof.

57. The method according to claim 52 wherein said extension 
products also serve as templates. 

58. The method according to claim 52 wherein said label is 
selected from the group consisting of fluorescein, dinitrophenol, biotin, and 
digoxigenin. 

59. The method according to claim 52 wherein said label is 
selected from the group consisting of 32P, 33P, 3H, 14C, and 35S.

60. The method according to claim 52 wherein steps e)-g) are 
repeated up to 20 times. 

61. A kit for labeling DNA, said kit comprising in association: 

a) a restriction endonuclease reagent, according to 
claims 16 or 18; 

b) a restriction endonuclease buffer; and 

c) a labeling buffer. 

62. The kit according to claim 61 wherein said restriction 
endonuclease reagent comprises CviJ I.

63. The kit according to claim 62 wherein said restriction 
endonuclease buffer is CviJ I* restriction endonuclease buffer. 






WO 94/21663 



PCT/US94/03246 






64. The kit according to claim 61 wherein said restriction 
endonuclease reagent is selected from the group consisting of CGase I and Aci I. 

65. The kit according to claim 64 wherein said restriction 
endonuclease buffer is CGase I buffer. 

66. The kit of claim 64 further comprising: 

d) a concentrated mixture of 1 or more nucleotide 
triphosphates; 

e) a DNA polymerase; 

f) control DNA, said control DNA being useful for monitoring 
the efficiency of labeling. 

67. The kit according to claim 66 wherein said nucleotide mixture 
is an equimolar mixture of one or more nucleotides selected from the group 
consisting of dCTP, dTTP, dATP, and dGTP. 

68. The kit according to claim 66 additionally comprising a labeled 
nucleotide selected from the group consisting of biotin-11-dUTP,
digoxigenin-11-dUTP and fluorescein-11-dUTP.

69. The kit according to claim 66 additionally comprising a labeled 
nucleotide selected from the group consisting of 32P-labeled nucleotides,
33P-labeled nucleotides, 14C-labeled nucleotides, 35S-labeled nucleotides, and
3H-labeled nucleotides.

70. The kit according to claim 66 wherein said DNA polymerase 
is the Klenow fragment of DNA polymerase I.
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71. The kit according to claim 66 wherein said DNA polymerase 
is a thermostable DNA polymerase. 

72. The kit according to claim 66 wherein said thermostable DNA 
polymerase is Thermus flavus DNA polymerase.

73. A method for universal thermal cycle labelling DNA 
comprising the steps of: 

a) mixing an aliquot of template DNA with a holo- 
enzyme of a thermostable DNA polymerase, whereby the 
polymerase provides endogenously purified DNA primers; 

b) denaturing said mixture of template DNA and said 
endogenous DNA primers; 

c) annealing said mixture of denatured template DNA 
and said endogenous DNA primers to form a DNA-primer 
complex; 

d) performing an extension reaction from said 
endogenous DNA primers in said DNA-primer complex 
using said DNA polymerase in the presence of one or more 
nucleotide triphosphates and wherein at least one nucleotide 
triphosphate has a label; 

e) heat-denaturing said labeled extension products; 

f) reannealing said endogenous primers with said 
template DNA and with said extension products; 

g) performing at least one additional extension reaction 
from said DNA-primer complex using a DNA polymerase. 






74. The method according to claim 73 wherein said heat-stable
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment
thereof. 

75. The method according to claim 73 wherein said extension 
products also serve as templates. 



76. The method according to claim 73 wherein said label is 
selected from the group consisting of fluorescein, dinitrophenol, biotin, and 
digoxigenin. 



77. The method according to claim 73 wherein said label is 
selected from the group consisting of 32P, 33P, 3H, 14C, and 35S.

78. The method according to claim 73 wherein steps e)-g) are 
repeated up to 20 times. 

79. A kit for labeling DNA, said kit comprising in association: 

a) a holo-enzyme of a thermostable DNA polymerase; 
and 

b) a DNA polymerase buffer. 

80. The kit of claim 79 further comprising: 

c) a concentrated mixture of 1 or more nucleotide 
triphosphates; 

d) control DNA, said control DNA being useful for monitoring 
the efficiency of labeling. 
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81. The kit according to claim 80 wherein said nucleotide mixture 
is an equimolar mixture of one or more nucleotides selected from the group 
consisting of dCTP, dTTP, dATP, and dGTP. 

82. The kit according to claim 80 additionally comprising a labeled 
nucleotide selected from the group consisting of biotin-11-dUTP,
digoxigenin-11-dUTP and fluorescein-11-dUTP.

83. The kit according to claim 80 additionally comprising a labeled 
nucleotide selected from the group consisting of 32P-labeled nucleotides,
33P-labeled nucleotides, 14C-labeled nucleotides, 35S-labeled nucleotides, and
3H-labeled nucleotides.

84. The kit according to claim 80 wherein said thermostable DNA 
polymerase is Thermus aquaticus DNA polymerase.

85. The kit according to claim 80 wherein said thermostable DNA 
polymerase is Thermus flavus DNA polymerase. 

86. A method for labeling of restriction-generated oligonucleotides, 
the method comprising the steps of:

a) digesting an aliquot of template DNA according to 
claim 21; 

b) heat denaturing said digested DNA thereby generating 
sequence-specific oligonucleotides; and 

c) labeling said sequence-specific oligonucleotides with 
a label capable of detection. 
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87. The method according to claim 86 wherein said restriction- 
generated oligonucleotides are labeled on the 5' end. 

88. The method according to claim 86 wherein said restriction- 
generated oligonucleotides are labeled on the 3' end. 

89. The method according to claim 86 wherein the label is
radioactive.

90. The method according to claim 86 wherein the label is
non-radioactive.
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91. A method for anonymous primer cloning, the method 
comprising the steps of: 

a) digesting an aliquot of template DNA according to claim 21 
thereby generating anonymous DNA fragments; 

b) digesting a plasmid cloning vector with a restriction 
endonuclease thereby creating a cloning site for insertion of said 
anonymous DNA fragments; 

c) ligating the anonymous DNA fragments of step a) into the 
cloning site of step b) thereby creating recombinant plasmids; 

d) transforming competent bacteria with the recombinant 
plasmids; 

e) selecting transformed colonies;

f) purifying the recombinant plasmids from said transformed 
bacteria; 

g) digesting the recombinant plasmid with a restriction 
endonuclease said restriction endonuclease being capable of cutting 
said recombinant plasmid at a site, said site lying within the cloned 
anonymous DNA fragment; 

h) annealing one or more extension primers to the digested 
recombinant plasmid, said extension primers being complementary 
to plasmid sequences flanking the anonymous primer; 

i) extending the extension primer in a template-dependent 
fashion in the presence of one or more nucleotide triphosphates and 
a DNA polymerase; and 

j) denaturing the said hybridized extended primer. 

92. The method according to claim 91 wherein said restriction 
endonuclease reagent comprises CviJ I.
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93. The method according to claim 91 wherein said restriction 
endonuclease reagent comprises CGase I. 

94. The method according to claim 91 wherein said plasmid 
cloning vector is pFEM. 

95. The method according to claim 94 wherein the restriction 
endonuclease of step b) is Eco RV. 

96. The method according to claim 91 wherein said extension 
primer has a label capable of detection. 

97. A kit for anonymous primer cloning comprising in association: 

a) a restriction endonuclease reagent, according to claims 16 or 
18; 

b) a restriction endonuclease buffer; 

c) a cloning vector; 

d) competent bacteria; 

e) one or more extension primers said extension primers being 
complementary to plasmid sequences flanking said anonymous 
primers; and 

f) a DNA polymerase reagent. 

98. The kit according to claim 97 wherein said restriction 
endonuclease reagent comprises CviJ I.

99. The kit according to claim 98 wherein said restriction 
endonuclease buffer is CviJ I* buffer. 
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100. The kit according to claim 97 wherein said restriction 
endonuclease reagent is selected from the group consisting of CGase I and Aci I.

101. The kit according to claim 100 wherein said restriction 
endonuclease buffer is CGase I buffer. 

102. The kit according to claim 97 wherein said cloning vector is
pFEM.
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1201 AAAACAATTG 
1261 AGAGAAAAAA 
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1521 TACTTTTGTA 
1601 GTAAAGCGAA 
1661 CCTAAAGATC 
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1841 CTGCTAATAA 
1921 AAACCAGGAA 
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TAATGGATAA 
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CGCTTTTCAA 
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TTTTAATGGT 
CGGCGTTACA 
6TCAACAAAT 
GATCGTTTCG 
GAAC6ACGAT 
CGAACTACTG 
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CAATACAGAC 
TAAATTCAAC 
TAGTAACCCA 
AAC6ATCCT6 
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ATGACATTAT 
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CATC6CCATT 
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868 TTC GAG AAT TTT ACT CCG AAA CAT CAA CCC CTT TTC TAT CTA ACA ACA CAA CCC ACT ACT ^ 

1028 AGC GCG ACA AGT ATA ACA ACT CTC CTC AAT CGT GTC ACT TAT AAA AAT TTA AGA TTC TTT 

1088 ATA CAT CCA TAC AAC TTT CTT TCT TCA AAA ACA CAA CGT ATT ATG TAG CACCATTTTCCCCAC 

1152 ACACTTTGTTGACCCCGTACTAAAAAATGSTCACCATA—3TCTAAACATCCTCATACAAGCAGCTCCAAACCTTCAC 
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Gene M. CviJI from Chlorella Virus IL-3A", pages 16-24, 
especially Figure 3. 

1-11
1-11
1-11
1-11

[ ] Further documents are listed in the continuation of Box C. [ ] See patent family annex.
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BOX D OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
Ttui ISA found multiple invention* u follows: 

tar — — **. - ^« o*. 4 

» Clui 435, rabchMlTCjTbr onuiple •«««>■»««» prate* deoing after digestion with Cvifl. cUeeiral 

Detailed Reason, for Lack of Unity 



invent- !f^/!^!!L 13 ?* ^ P™^ 16 ° f ""^ ° f « v «*>*>" *« «n application dvould relate Co only one 

The thirteen inventions of this application consist of: 

1) » polynucleotide encoding CvUI, the vector comohiin ff k ih**««_f rt 1 1^- 

making the protein usinglthe vector. ^ 8 ' theSMa(om ^ te « «"y««8 ihe vector, and . method of 

2) the recombinant peptide CvUI, 

3) a metho d for re striction cndonucleaae digestion using CvUI 

3 a C o^ r ^^°r i ^ k ^ tnd * meth0<Jfor "** * '» ^triction cndonucleaae digestion 

5) a method for shotgun cloning after partial digestion using CviJI,

6) a method for shotgun cloning after partial digestion using CGase I,

7) a method of extension labeling of DNA and thermal cycle labeling using CviJI,

8) a method of extension labeling of DNA and thermal cycle labeling using CGase I,

9) a universal thermal cycle labeling of DNA,

10) a method of end labeling after CviJI digestion,

11) a method of end labeling after CGase I digestion,

12) a method for anonymous primer cloning after digestion with CviJI, and

13) a method for anonymous primer cloning after digestion with CGase I.

For the following reasons, within the meaning of PCT Rule 13.2, there is no technical relationship among the
inventions involving at least one common or corresponding special technical feature.
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constitute a special technical feature within the meaning of PCT Rule 13.2. 

The methods for restriction endonuclease digestion, shotgun cloning and sequencing with CviJI, for extension
and thermal cycle labeling with CviJI, for universal cycle labelling, for end labeling after CviJI digestion, and for
anonymous primer cloning after CviJI digestion involve a corresponding technical feature, digestion with CviJI, that
does not define the contribution which each claimed invention, considered as a whole, makes over the prior art because
restriction endonuclease digestion, and shotgun cloning and sequencing, extension and thermal cycle labeling after rest,
universal cycle labelling, end labeling, and anonymous primer cloning after restriction endonuclease digestion are well
known in the art. In addition, CviJI is also known in the art. Accordingly, such does not constitute a special technical
feature within the meaning of PCT Rule 13.2.

Similarly, the methods for restriction endonuclease digestion, shotgun cloning and sequencing with CGase I,
for extension and thermal cycle labeling with CGase I, for universal cycle labelling, for end labeling after CGase I
digestion, and for anonymous primer cloning after CGase I digestion involve a corresponding technical feature, digestion
with CGase I, that does not define the contribution which each claimed invention, considered as a whole, makes over the
prior art because restriction endonuclease digestion, and shotgun cloning and sequencing, extension and thermal cycle
labeling after rest, universal cycle labelling, end labeling, and anonymous primer cloning after restriction endonuclease
digestion are well known in the art. Accordingly, such does not constitute a special technical feature within the
meaning of PCT Rule 13.2.
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